The Agent That Deleted Production
What Actually Happened
In December 2025, Amazon's AI coding agent Kiro was assigned a routine task: fix a minor issue in AWS Cost Explorer within the Beijing region. The agent analyzed the problem and determined that the most efficient resolution was to delete the production environment and recreate it from scratch.
The result was a thirteen-hour outage across an AWS China region. Amazon's official response attributed the failure to "user error — specifically misconfigured access controls — not AI." Four anonymous sources inside Amazon told the Financial Times a different version of events.
Weeks before the incident, senior VPs had issued what employees called the "Kiro Mandate," requiring 80% weekly usage across development teams. The system operated with operator-level permissions. There was no mandatory peer review before changes went to production. There was no human-in-the-loop checkpoint before destructive actions — the kind that deletes an environment and starts over.
A second incident followed shortly after, involving Amazon Q Developer, under nearly identical circumstances.
The Permissions Problem
I've been talking to our senior engineers about this, and their reaction is consistent: every one of them reacts viscerally, and not one of them is surprised.
If you've managed production infrastructure for any length of time, you've internalized a set of rules about access control that become second nature. You don't give a deployment pipeline unrestricted write access to production databases. You don't let automated systems execute destructive operations without a review gate. You scope permissions to the minimum required for the task. You treat secrets — credentials, API keys, certificates — as sensitive assets with their own access controls and rotation policies.
These principles aren't theoretical. They're the result of decades of real incidents where someone (or something) with too much access did exactly what Kiro did: found the shortest path to a solution without understanding which paths were off limits.
Kiro didn't malfunction. It did precisely what it was designed to do. The failure was that nobody applied basic infrastructure discipline to the permissions model. The agent had operator-level access because someone configured it that way, and nobody built the guardrails that would have caught a "delete production" command before it executed.
Why This Keeps Happening
74% of organizations are planning agentic AI deployments within the next two years. Only 21% have what Gartner considers mature governance for autonomous systems. Nearly half — 47% — of organizations that have already deployed AI agents report observing unintended or unauthorized behavior.
These numbers describe a specific gap, and it's worth understanding the shape of it. The teams deploying AI agents don't typically come from infrastructure or DevOps backgrounds. They're ML engineers, data scientists, and application developers who are brilliant at model training and prompt design but haven't spent a decade thinking about production access patterns, blast radius containment, or what happens when an automated process has more permissions than it needs.
There's a parallel to the early days of cloud adoption here. When companies first started moving workloads to AWS and Azure, they gave application teams direct access to provision resources. Those teams built what they needed — and accidentally left S3 buckets open to the internet, provisioned databases without encryption, and created IAM roles with wildcard permissions because that was the fastest path to a working demo. It took years and a steady stream of breaches before "cloud security" became a discipline unto itself, with infrastructure teams reviewing every deployment before it went live. We're at the same inflection point with agentic AI, except the consequences move faster because agents act autonomously.
In previous articles, we described the forward-deployed team model — senior engineers embedded in your operations, building systems grounded in your actual constraints. One of the things we didn't emphasize enough is that our engineers carry deep DevOps and infrastructure experience. That experience is what makes them think about permissions boundaries, sandboxed execution, and secrets management before an agent gets anywhere near production data. Not because they read a governance checklist, but because they've been burned by the same class of problem enough times that the discipline is automatic.
If this can happen at Amazon — the company that built the cloud infrastructure the rest of us run on — what makes you think your organization is immune? Amazon doesn't lack engineering talent. They don't lack infrastructure expertise. And it still happened. The question every CTO should be asking right now isn't "could this happen to us?" It's "what's stopping it?"
The Agentic AI Governance Deficit
Gartner projects that over 40% of agentic AI initiatives will be scrapped by 2027 — not because the AI doesn't work, but because of rising costs, unclear value, and inadequate risk controls. The technical capability is ahead of the governance architecture.
We're seeing this firsthand with clients. Organizations that have been running AI experiments in sandboxes for a year are now pushing to deploy agents in production. The conversation shifts from "can we build this?" to "how do we deploy this safely?" — and the teams doing the building often don't have a good answer.
The common response is to write a governance framework. An AI ethics committee produces a document. A risk assessment gets filed. Someone creates a permissions policy that describes what agents should and shouldn't be able to do. The document goes into a SharePoint folder and gets reviewed quarterly.
Meanwhile, the agents are running in production with whatever permissions they were given during development, because nobody translated the policy into enforcement.
What Engineering-Grade Governance Looks Like
This is where we spend a lot of our engineering time, and it's where Fabric — our purpose-built AI infrastructure — came from.
The first engineering decision, before you write a single line of agent code, is identifying which integration points in your infrastructure require deterministic behavior and which benefit from adaptive reasoning. A delete command on a production environment, a funds transfer, a permissions change: these need to behave 100% predictably, with guardrails that enforce those constraints unconditionally. An agent analyzing anomalies in your operational data, synthesizing patterns across contracts, or reasoning about edge cases your rules engine hasn't seen before: that adaptive behavior is a feature, not a design flaw. The Kiro incident happened because nobody made this distinction. The agent's adaptive reasoning was applied to a task that demanded deterministic control.
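The distinction can be made concrete with a routing layer: destructive operations hit a hard, deterministic gate, while everything else flows to the agent's adaptive path. The sketch below is illustrative only; the names (`DESTRUCTIVE_OPS`, `dispatch`, the action strings) are hypothetical, not part of any real framework.

```python
# Hypothetical sketch: route agent-requested actions by risk class.
DESTRUCTIVE_OPS = {"delete_environment", "drop_table", "transfer_funds"}

def dispatch(action: str, approved: bool = False) -> str:
    """Send an action down a deterministic or an adaptive path."""
    if action in DESTRUCTIVE_OPS:
        # Deterministic path: a hard gate the model cannot reason around.
        if not approved:
            raise PermissionError(f"'{action}' requires explicit human approval")
        return f"executing {action} with approval on record"
    # Adaptive path: the agent reasons freely within its sandbox.
    return f"agent handles '{action}' adaptively"
```

The point of the gate is that it lives outside the model: no amount of "efficient" reasoning by the agent can route around a check the infrastructure performs before execution.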
Fabric was designed from the ground up around this principle and around a broader one that our engineers consider obvious but that most AI deployments ignore: AI agents are workloads. They need the same infrastructure discipline as any other automated system that touches production data and production systems.
Sandboxed code generation. When an AI agent generates and executes code in your environment, that execution happens in a sandboxed context with explicit resource boundaries. The agent can't reach systems or data outside its scope because the infrastructure physically prevents it — not because a policy says it shouldn't.
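Real sandboxing requires OS-level isolation (containers, seccomp profiles, resource quotas), but the basic shape (a separate process, no inherited environment, a hard timeout) can be sketched in a few lines. The `run_sandboxed` helper below is hypothetical, not Fabric's implementation.

```python
import subprocess
import sys

def run_sandboxed(code: str, timeout: float = 5.0) -> str:
    """Execute generated code in a separate process with a clean
    environment and a hard timeout. Illustrative only: production
    sandboxing layers real OS-level isolation on top of this."""
    result = subprocess.run(
        [sys.executable, "-I", "-c", code],  # -I: isolated mode, no user site-packages
        capture_output=True,
        text=True,
        timeout=timeout,
        env={},  # no inherited credentials, tokens, or cloud config
    )
    return result.stdout
```

Passing an empty `env` means the child process never sees the API keys or cloud credentials sitting in the parent's environment, which is exactly the class of leak that turns a code-generation bug into an access-control incident.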
Scoped permissions. Every agent operates under a permissions model that mirrors the least-privilege principle from infrastructure security. Access to specific datasets, specific APIs, specific operations. Destructive operations require explicit approval gates. The agent can't escalate its own permissions.
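In code, that scoping can be as simple as an immutable allowlist checked before every operation. A minimal sketch, with all names (`AgentScope`, `billing_agent`, the dataset strings) hypothetical:

```python
from dataclasses import dataclass

@dataclass(frozen=True)  # frozen: the agent cannot mutate its own scope
class AgentScope:
    datasets: frozenset
    operations: frozenset

    def authorize(self, dataset: str, operation: str) -> None:
        """Raise unless both the dataset and the operation are in scope."""
        if dataset not in self.datasets:
            raise PermissionError(f"dataset '{dataset}' is outside this agent's scope")
        if operation not in self.operations:
            raise PermissionError(f"operation '{operation}' is not permitted")

# Example: a cost-analysis agent that can read and aggregate, nothing else.
billing_agent = AgentScope(
    datasets=frozenset({"cost_reports"}),
    operations=frozenset({"read", "aggregate"}),
)
```

The frozen dataclass is the "can't escalate its own permissions" property in miniature: the scope is fixed at creation time by whoever provisions the agent, not by the agent itself.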
Private secrets management. Credentials, API keys, certificates, and tokens stay inside your infrastructure. They're managed through the same secrets management patterns that DevOps teams have been using for production systems for a decade — scoped access, rotation policies, audit trails. They never flow through third-party systems or leave your perimeter.
Local MCPs. Your data stays sovereign. Model Context Protocol (MCP) servers run locally in your environment, keeping the context your agents work with — your business logic, your operational data, your customer information — under your control and your compliance framework.
Runtime enforcement. The governance isn't a quarterly review or a dashboard someone checks periodically. It's enforced at runtime, continuously. Every agent action is logged, every permission boundary is active, every scope constraint is enforced in real time. The governance architecture and the production architecture are the same thing.
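One way to make enforcement continuous rather than periodic is to wrap every agent-facing operation so it cannot execute without emitting an audit record, whether the call was allowed or denied. A sketch under that assumption; the decorator and log names are illustrative:

```python
import functools
import time

AUDIT_LOG: list = []

def audited(agent_id: str):
    """Wrap an operation so every call, allowed or denied, is logged."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            entry = {"agent": agent_id, "action": fn.__name__, "ts": time.time()}
            try:
                result = fn(*args, **kwargs)
                entry["outcome"] = "allowed"
                return result
            except PermissionError as exc:
                entry["outcome"] = f"denied: {exc}"
                raise
            finally:
                AUDIT_LOG.append(entry)  # the record exists either way
        return wrapper
    return decorator

@audited("cost-agent-01")
def read_report(name: str) -> str:
    return f"contents of {name}"
```

Because the logging lives in the wrapper rather than in the operation, there is no code path where an agent acts and the audit trail stays silent; that is the difference between governance as a dashboard and governance as architecture.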
None of these are novel engineering concepts. Every principle here is borrowed directly from how senior DevOps engineers have managed production infrastructure for years. The difference is that most AI deployments skip this step entirely because the teams doing the building don't think in infrastructure terms.
The Team Problem Behind the Governance Problem
In our last article, we described how generic copilots fail because they don't solve the operational problems that actually matter — the 20% that's specific to your business. The governance version of that problem is the same: generic governance frameworks fail because they're written by people who understand compliance but not infrastructure.
An AI governance document that says "agents should operate with least-privilege access" is about as useful as a security policy that says "employees should use strong passwords." Both are true. Neither tells you how to enforce it.
Enforcement requires engineers who understand both the AI system and the infrastructure it runs on. Engineers who know how to design permission boundaries at the infrastructure level, not just describe them in a document. Engineers who've built production systems with secrets management, blast radius containment, and automated rollback — and who apply that same rigor to AI agent deployments.
This is why we staff our teams with senior engineers who've spent years in DevOps and infrastructure roles before working on AI systems. When one of our engineers designs an agentic deployment, the permissions architecture is the first conversation, not an afterthought. What can the agent access? What can it modify? What requires human approval? What happens when it attempts something outside its scope? These aren't governance checkboxes. They're engineering decisions that get baked into the infrastructure on day one.
The Kiro-style failure mode — an agent with too much access doing something destructive because nobody built the guardrails — is a solved problem if you have the right people building the system. It's a catastrophic problem if you don't.
Where This Goes
The agentic AI wave isn't slowing down. 74% of organizations are committed to deployment timelines. The economic incentives are real — autonomous agents that can handle complex multi-step workflows represent a genuine productivity shift. We're building agentic systems for our clients right now, and when the governance architecture is right, the results are remarkable. The agents move fast precisely because the boundaries are clear. The sandbox gives them freedom to iterate within a safe perimeter, which means our engineers spend their time refining the agent's judgment rather than cleaning up the damage from an unconstrained one.
But the gap between "the agent works in the sandbox" and "the agent is safe in production" is as wide as it's ever been. The teams that close that gap will be the ones with engineers who think about infrastructure security as reflexively as they think about model accuracy. The Kiro story doesn't end with Amazon — it's a preview of what's coming for every organization that deploys agents with the permissions architecture they used in their research environment.
And one more thing worth paying attention to: while enterprises are still debating governance frameworks, tools like OpenClaw are giving startups and small teams the ability to deploy autonomous AI agents with almost no friction. Yes, these tools are immature — Cisco's security team found data exfiltration and prompt injection in third-party OpenClaw skills just weeks ago — and the skill repositories lack adequate vetting. They're dangerous in the ways you'd expect from early-stage, move-fast tooling. But remember the iPhone story from our last article. These tools will harden quickly. The security will mature. The ecosystems will professionalize. And while your organization is still running governance committee meetings, the companies in your rearview mirror — or the ones not even on the road yet — are building with their pedal to the metal. The governance work you do now isn't just about preventing the next Kiro incident. It's about making sure you can move as fast as the market demands when the tooling catches up.
In our next piece, we'll go deeper on what it means to ship governance that actually works — the regulatory deadlines that are making this urgent, and why the organizations getting it right are treating governance as something you build into architecture rather than something you write into documents.
Want to pressure-test your own systems against this? Schedule a conversation with an engineer who builds this way.