What Makes Agents Different? -- Part 1
Agentic software represents a change in scale, not a change in shape, in how your infrastructure gets used. A change in scale, however, can be a change in principle. Humans and microservices are useful analogies here, but we must consider what happens when we stretch them to be vastly more distributed, dynamic, transient, and stateful.
So, let’s consider these questions:
What would break if your organization grew 1000x tomorrow?
Consider replacing any given person at your company with an AI Agent today. What would you no longer trust them to do that you did yesterday?
If any employee at your company could deploy a new microservice to prod with zero review or notice, how would you prevent that service from harming other systems?
Agents are cheap, really cheap
What would break if your organization grew 1000x tomorrow?
Comparing the balance-sheet-denting cost of requests to LLM-backed agents against the ~$0 cost of a marginal microservice request is... comparing beasts of different categories. A more accurate comparison is how long a human would take for the same task. $0.30 to write a first “working” attempt at a business intelligence query? Compare that to an hour of reading documentation plus the trial and error of getting it running! That's an incredible cost reduction, and one that benefits any employee who wants to perform such queries. It even has the potential to let any employee, not just data scientists, leverage BI knowledge.
It doesn't matter that the results are imperfect; there are already plenty of tasks for which an LLM-backed workflow meaningfully accelerates human labor. $0.50 or even $5 vs. $50 represents a very large consumer surplus…. Employees are going to use these tools, whether management approves or not. The value to them, individually, can be just that great.
Reducing your workforce in the face of agentic automation is short-sighted…
A natural first consideration for management is “oh, we can reduce our workforce and save a ton!” That, however, may be short-sighted. It only makes sense if you aren't growing and are prioritizing cost reduction.
Automation like this creates three regimes of hiring strategy:
1. Fixed product – shrink the workforce
2. Product ambitions (net of constraints other than labor) smaller than the returns on investment in agentic technology – a previously expanding workforce becomes fixed
3. Product ambitions outstripping the returns from agentic technology – expand the workforce
Any FAANG-like company or startup fits into regime #3. They're constrained by time and resources, not ambition. To deliver on that ambition, management leans, time and time again, on expanding the workforce.
For the immediate future, this may not mean changing hiring strategy. It could mean any number of options like:
handing off automatable tasks to agentic software
building workflows to scale individuals beyond what was previously possible
or letting individuals perform tasks previously requiring whole teams (e.g. developing and deploying entire applications).
We've yet to capture even the value current LLMs offer in bridging the human-computer divide; the average employee will get more productive as we get better at agent wrangling. Frontier models aren't fixed, however, and new technologies are coming down the pike. Improving models and improving agentic tooling will give us two domains of leverage in this new world.
If people are suddenly spinning up tens or hundreds of agents each, how will we handle the strain on organizational infrastructure? Will every agent need to be onboarded? What will it mean to delegate access? Imagine if each novel agent required the same onboarding as a new employee… your company’s Jira-based processes would implode under their own weight.
If agents are dirt cheap relative to people, we’re going to end up with a lot of them. How will you best leverage them in your organization? How will your organization have to change to accommodate them?
Any but not all
Consider replacing any given person at your company with an AI Agent today. What would you no longer trust them to do that you did yesterday?
Humans frequently complete novel tasks no one in their organization has seen before. This isn’t, classically, something services do.
Maybe they’re on call and need to debug a system outage they’ve never considered before, accessing logs and systems from disparate teams.
Maybe they realize a problem they’re working on has already been solved elsewhere and need to use the existing system to accelerate their current task.
Maybe they’ve been assigned a new project requiring access to sensitive data they’ve never touched.
On the human scale, it’s not unheard of (and in many organizations is expected) for novel jobs to require filing tickets, getting training, and meeting with other people; novel work often requires significant time investment within a human organization. Further, novel work can affect any part of a human organization. You might even trust someone to gain all the access necessary for a task long before they need to perform it.
Why do we trust humans so much? There’s a social fabric within the enterprise. Duty, guilt, shame, training, etc. lead us to believe that, in general, the risk of someone doing bad things is low. We tune this risk tolerance based on domain and context, increasing or decreasing requirements for training, tickets, and so on as we see necessary. The biggest knob we turn is adding time in the way of doing bad things: the more time required, the harder it is to do a lot of damage before someone notices and stops it.
Now consider this same novelty, but remove all that social fabric. How can you build trust back into a system where the actor you’re trusting doesn’t have shame, can’t predictably be trained, and has no sense of duty or fear of being fired?
Speeding up low-trust task execution removes our biggest control knob to reduce risk exposure: adding time.
To make matters worse, what if the whole point of replacing the human in this scenario is to save time? That removes our biggest control knob…. How can we accelerate access while increasing trust and decreasing potential damage? It’s a very constrained design space, but one that’s very real when considering the adoption of AI agents.
A good, scalable option for solving this problem: locally constrain the scope of action. We need authorization systems that respond in real time to demands for access and action. With that leverage, we can give exactly the right abilities to exactly the person or system that needs them, for only the task to be performed. Further, we need to record a deep audit trail tying together intention, authorization, and a full log of everything the agent does. Directly recording what matters, instead of trying to rebuild it from a haystack of metrics and disparate logs, lets us demonstrate that nothing bad happened in the success case. In the failure case, we can know exactly how things went wrong… so we can improve for next time.
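To make that concrete, here is a minimal sketch of what such a system could look like. Every name in it (Grant, AuditLog, authorize) is hypothetical; this illustrates the shape of the idea, not any real implementation:

```python
# Minimal sketch: just-in-time, task-scoped authorization with a built-in
# audit trail. All names here are hypothetical and purely illustrative.
import time
import uuid
from dataclasses import dataclass, field

@dataclass
class Grant:
    """A short-lived capability scoped to one task, not one identity."""
    grant_id: str
    agent_id: str
    intent: str              # the human-readable reason the agent asked
    resources: frozenset     # exactly what this task needs, nothing more
    expires_at: float        # grants decay; standing access never accrues

    def allows(self, resource: str) -> bool:
        return resource in self.resources and time.time() < self.expires_at

@dataclass
class AuditLog:
    """Ties intention, authorization, and action into one record stream."""
    events: list = field(default_factory=list)

    def record(self, grant: Grant, action: str, resource: str, allowed: bool):
        self.events.append({
            "ts": time.time(),
            "grant_id": grant.grant_id,
            "agent_id": grant.agent_id,
            "intent": grant.intent,
            "action": action,
            "resource": resource,
            "allowed": allowed,
        })

def authorize(agent_id: str, intent: str, resources: set, ttl_s: float) -> Grant:
    """Issue exactly the access one task needs, for only as long as it needs."""
    return Grant(
        grant_id=str(uuid.uuid4()),
        agent_id=agent_id,
        intent=intent,
        resources=frozenset(resources),
        expires_at=time.time() + ttl_s,
    )

# Usage: an agent asks for narrow access, acts, and every decision is logged.
log = AuditLog()
grant = authorize("agent-42", "debug checkout latency", {"logs:checkout"}, ttl_s=900)
for resource in ["logs:checkout", "db:payments"]:
    allowed = grant.allows(resource)
    log.record(grant, "read", resource, allowed)  # successes AND denials recorded
```

The key design choice: grants are scoped to a single task and expire on their own, so standing access never accrues, and both allowed and denied actions land in the same audit stream tied back to the original intent.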
We need to build the infrastructure to give agents access to anything without having to give them access to everything all at once. Couldn’t this reduce friction in the human organization as well? This type of system has been demonstrated to reduce time-to-access and the number of bureaucratic steps at companies like Google and Facebook/Meta; why not everywhere else?
How will we move to this new just-in-time world, one we’ve wanted for a very long time, when it’s eluded us so far?
YOLO
If any employee at your company could deploy a new microservice to prod with zero review or notice, how would you prevent that service from harming other systems?
Traditionally, we reduce the risk of new software being deployed by
requiring multiple people to approve, catching issues that one person might overlook
testing functionality, ensuring it does what we expect it to do (and continues operating as we expect, even in the face of change)
delaying deployment and letting things run in representative environments, to make sure the software’s behavior in integration matches what we expected under the “lab conditions” of review and testing
What if a machine wrote the code, wrote the tests, did code review, or did the evaluations? What if we wanted to let any employee, engineer or not, go from idea to demonstration without any roadblocks or friction?
This fundamentally breaks common practice around the software development life cycle. In the extreme case, a human’s first involvement is requesting the functionality, and the first time that same human evaluates correctness and expectations in earnest might be when they try to use the live service!
A key point to note here is that, while people may “watch” the development happening, if they aren’t trained to A) review code, B) look for critical failures, or C) understand the implications of the underlying technical solution, they very well may not be able to recognize (much less correct) failure until the system is live! Even an experienced engineer might be more lax, or spread too thin, in a world where thousands of lines of code “appear instantly” and must be considered. Will engineers feel good about spending two days reviewing code that was written in two minutes?
Trust in production is going down; how do we fix it?
The reality is that trust in production, all else held constant, is going to plummet. How do we repair this bursting dam?
There are many dimensions one could control around the tools used for development: more statically verifiable languages and frameworks, better specifications of valid behavior, more comprehensive (and human-understandable) testing solutions, etc. can all shore up some trust. The reality, though, is that A) people will want to keep using the tools they already know, and B) LLMs know best the tools that are already best documented.
That means we can’t rely exclusively on better development tools. We need better controls in production. Better authorization management, better observability, easier rollback: more intelligent production management in general.
More intelligent production management probably means better integration between systems like observability and authorization. Knowing not only what is happening, but who is doing it and why it was authorized, is going to be essential for root-cause analysis and blast-radius reduction.
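As a hedged sketch of what that integration could look like (trace_event and blast_radius are hypothetical names, not any real API), imagine every production event carrying the full chain from action to actor to authorization:

```python
# Sketch of authorization-aware observability. Every event carries not just
# what happened, but who did it and why it was authorized. The names
# trace_event and blast_radius are hypothetical, not a real API.

def trace_event(action: str, actor: str, grant_id: str, intent: str) -> dict:
    """Emit one event with the full chain: action -> actor -> authorization."""
    return {"action": action, "actor": actor,
            "grant_id": grant_id, "intent": intent}

def blast_radius(events: list[dict], grant_id: str) -> list[str]:
    """Root-cause helper: everything done under one authorization decision."""
    return [e["action"] for e in events if e["grant_id"] == grant_id]

events = [
    trace_event("deploy:svc-a", "agent-7", "g-123", "ship checkout fix"),
    trace_event("migrate:db",   "agent-7", "g-123", "ship checkout fix"),
    trace_event("read:metrics", "agent-9", "g-456", "weekly usage report"),
]

# If svc-a breaks, "what else did that authorization touch?" is one lookup.
print(blast_radius(events, "g-123"))  # ['deploy:svc-a', 'migrate:db']
```

With that chain recorded up front, “what else did this authorization touch?” becomes a lookup rather than a forensic reconstruction from disparate logs.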
How are you going to shore up your dam?
Conclusion
Agentic software represents a dramatic acceleration in the scale at which software takes action. This will affect every organization. Software ate the world, and now we’re giving it agency…. We need something that keeps us all sane, safe, and secure, and makes us confident in using these agents. We need some way to build a social fabric for agents.

