On 5 January 2025, OpenAI CEO Sam Altman outlined his vision for 2025 in a post on his personal blog. In it, Altman proclaimed that “in 2025, we may see the first AI agents ‘join the workforce’ and materially change the output of companies.” His remarks set the tenor for the AI industry through 2025.
But did AI agents actually join the workforce in 2025? The answer is yes, absolutely—or no, not at all. It depends on who you ask.
Michael Hannecke, a sovereign AI and security consultant at Bluetuple.ai, says that “everyone” is looking into how to use AI agents. “But there’s also a kind of disillusionment. It’s not that easy. You don’t just throw AI at anything, and it just works.”
While many industries have expressed interest in AI agents, programmers and software engineers have seemingly leapt towards the front of the pack. Brandon Clark, senior director of product and engineering at Digital Trends Media Group, shares this enthusiasm. He has fully moved his work into AI tools and now trusts the capabilities of AI agents in most situations.
“I use Cursor as my daily driver to develop code,” says Clark. He also frequently uses Anthropic’s Claude Code, bouncing between Cursor and Claude, not only because he has some preference for the tasks each are best at solving, but also to get around usage caps—which gives an indication of just how frequently he uses agents. “Sometimes I run out of tokens on Claude Code […] at that point I’ll switch back over to Cursor and just continue my work.”
As with many programmers, Clark’s willingness to use agents is in part because of his background. Clark has years of experience working with integrated development environment (IDE) software. An AI-infused IDE, such as Cursor, presents agentic AI in a way that plugs into existing tools and workflows with relative ease.
His quick adoption also shows the ways AI agents are well equipped to handle some software engineering tasks. Take tests, for example: code that verifies software is working properly by checking its behavior against inputs and outputs known to be correct. Tests are important but repetitive, and they rarely require novel thinking to implement, which makes them easier for AI agents to handle.
“It’s at the point where I don’t even need to be involved. As part of the [AI] system instructions, I say that any time it writes a new feature, make sure to also write tests for it. And while you’re at it, run the tests, and if anything breaks, fix it,” Clark says.
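To make the idea concrete, here is a minimal sketch of the kind of repetitive test an agent can generate on its own. The `apply_discount` function, its expected values, and the use of the pytest framework are illustrative assumptions, not details taken from Clark's projects.

```python
# A sketch of agent-written tests: a small function plus checks against
# inputs and outputs known to be correct. Names and values are hypothetical.
import pytest


def apply_discount(price: float, percent: float) -> float:
    """Return the price after applying a percentage discount."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)


def test_apply_discount_known_values():
    # Verify behavior against known-correct input/output pairs.
    assert apply_discount(100.0, 20.0) == 80.0
    assert apply_discount(59.99, 0.0) == 59.99


def test_apply_discount_rejects_bad_input():
    # An out-of-range discount should raise rather than return a wrong price.
    with pytest.raises(ValueError):
        apply_discount(100.0, 150.0)
```

Running `pytest` on a file like this is exactly the "run the tests, and if anything breaks, fix it" loop Clark describes delegating to the agent.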
Programmers have also felt empowered by the invention of new ways to integrate AI across software, such as Anthropic’s Model Context Protocol (MCP) servers (introduced in November 2024) and Google’s Agent2Agent protocol (introduced in April 2025). These allow agents to call on software to complete or verify their work. For example, Cursor has browser tools that can be called as an MCP server. An agent programming for the web can use it to check the results of its work.
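For a sense of what that integration looks like in practice, below is a minimal sketch of an MCP server exposing a single tool, written with the FastMCP helper from the official MCP Python SDK. The server name, the `check_title` tool, and its logic are assumptions made for illustration; they are not a description of Cursor's actual browser tools.

```python
# A sketch of an MCP server: it exposes one tool an agent could call
# to verify its own work. Tool name and logic are hypothetical.
import re

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("page-checker")


@mcp.tool()
def check_title(html: str, expected: str) -> bool:
    """Report whether the page's <title> matches what the agent expected."""
    match = re.search(r"<title>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)
    return bool(match) and match.group(1).strip() == expected


if __name__ == "__main__":
    mcp.run()  # serves the tool over stdio so an agent client can call it
```

An agent pointed at this server can hand it the HTML it just generated and get back a pass/fail answer, the same verify-your-own-work pattern Clark relies on.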
Other AI agents are easy to imagine, but tough to deploy
For Clark, 2025 truly was the year of AI agents. He was able to experiment with them early in the year, and the results he saw only improved as better models were released and AI-focused coding tools improved. Others have had a more mixed experience.
Hannecke, an AI consultant based in Germany, saw no shortage of interest in AI agents through 2025. Yet when it came time to think more seriously about deployment, organizations often ran into trouble.
“I have only seen three or four use cases where companies have [AI agents] in production,” says Hannecke. “Most others are still in a development phase, still evaluating, still testing, due to the insecurities that can come with it.” He says many organizations react with a degree of “German angst” over the risks that come with AI automation. “There are a lot of things we’re not quite 100 percent sure about with AI agents.”
German and European regulations contribute to this reaction, to be sure, but they’re not the only reason for caution. Jason Bejot, senior manager of experience design at Autodesk, which makes 3D design software, articulated a concern that will be relatable to engineers across many fields: accountability.
“That’s one of the big challenges. […] How do I actually get it to work, to make it precise, so that I can get it built?” Bejot asks.
Autodesk has an agentic AI tool, Assistant, that can field questions from users of Autodesk software including AutoCAD, Autodesk Fusion, and Revit. However, as it exists today, the assistant is largely designed to be only that—an assistant. It can summarize information and provide guidance, but it’s not meant to take the reins and engineer a solution autonomously.
“You need to be able to have a clear through-line. If architect A has updated their sketches using the assistant, that person is still accountable for those updates,” says Bejot. “So, how do you create that level of accountability across the board? It’s something we’re very conscious of.”
Bridging the gap between agents and accountability
The varying experiences of Clark, Bejot, and Hannecke underscore the wide range of outcomes from AI agents through 2025 and into 2026. For some, agents are already working as Altman speculated. For others, there’s a lot more work to be done before agents can deliver.
Kiana Jafari, a postdoctoral researcher at Stanford University, has studied this gap. She co-authored a paper that found technical metrics like accuracy and task completion dominate 83 percent of AI agent assessments. These are metrics that can be verified and systematized, reflecting Clark’s experience as a programmer.
However, technical accuracy isn’t the only metric worth attention. “Most of the agentic systems that we are working with right now are in theory doing very well in terms of accuracy,” says Jafari. “But when it comes down to people using it, there are a lot of hurdles.”
In fields where professionals bear personal responsibility for outcomes, even AI agents that achieve a high standard of technical accuracy may not perform well enough. Jafari’s interviews with medical professionals have made it clear why this is the case. “What they all say is, ‘If there is a 0.001 percent chance that this could make mistakes, that is still my name. That is on me if it’s wrong.’” This can result in AI agents backsliding from an active role to an advisory one.
This can help explain the extreme divergence in the reception of AI agents. Some people believe that agents are productivity boosters with little to no downside; others see them as promising but early technology; still others see them as fundamentally dangerous. The reality is that AI agents can be all of these things, depending on the task they’re set to solve.
“There’s still the need for the human in the loop,” says Hannecke. “2025 was a lot of ‘let’s play with it, let’s prototype it.’ 2026 will be the year we put it into production, and find out what will be the difficulties we have to deal with when we scale it.”