AI Agent Governance: How to Control Autonomous Agents in Production
Table of Contents
- Why AI Agent Governance Matters Now
- The Five Pillars of an Agent Governance Framework
- 1. Agent Discovery and Inventory
- 2. Identity and Access Control
- 3. Runtime Guardrails
- 4. Policy Management
- 5. Monitoring and Observability
- Mapping Tools to the Governance Stack
- Practical Implementation: The Tiered Autonomy Model
- Green Tier — Full Autonomy
- Yellow Tier — Supervised Autonomy
- Red Tier — Human-Only
- Implementing the Tiers
- Where to Start
- Sources
- Related Tools
AI Agent Governance: How to Control Autonomous Agents in Production
A Fortune 500 company learned the hard way what happens when AI agent governance is an afterthought. One of their autonomous agents — running in production, with access to real infrastructure — dropped a database table. The guardrail that was supposed to prevent destructive operations? A hardcoded if-statement that checked for the string "DROP TABLE" in SQL output. The agent used a different tool call to accomplish the same thing, bypassed the check entirely, and SREs were pulling the plug at 3am on a Saturday.
This isn't a hypothetical from a safety research paper. It was reported by Galileo's CTO Yash Sheth in the blog post announcing Agent Control as a real incident that motivated the platform's design. And it's not an isolated case — in February 2026, Meta's Director of AI Safety and Alignment, Summer Yue, had to scramble to stop an AI agent from deleting her entire inbox after connecting it to her email. The incident, reported by PCMag, Fast Company, and Business Insider, prompted broader discussion about agent access controls in enterprise environments.
These incidents share a pattern: agents with real-world access, operating under governance models designed for a simpler era. String-matching guardrails. Coarse permission scopes. No runtime monitoring. The agents outgrew the controls, and the controls broke in production.
If you're building or deploying AI agents today, AI agent governance isn't a compliance checkbox — it's the engineering discipline that determines whether your agents are assets or liabilities.
Why AI Agent Governance Matters Now
Three things converged in early 2026 that moved agent governance from "nice to have" to "board-level concern."
Governments are publishing frameworks. In January 2026, Singapore released the Model AI Governance Framework for Agentic AI, outlining four governance dimensions: risk assessment, accountability, transparency, and oversight. Google followed with a responsible AI progress report that included a governance blueprint for autonomous systems. These aren't aspirational whitepapers — they signal that regulatory expectations for agent governance are forming now, and enterprises that ignore them will be playing catch-up.
The market formalized. Forrester defined the "Agent Control Plane" as a formal market category and announced evaluations of vendors in the space. When an analyst firm defines a category, it means enterprise buyers are actively asking for it, procurement teams are writing RFPs around it, and the space is no longer experimental.
Agent sprawl is real. The same dynamic that created "shadow IT" a decade ago is happening with agents. Teams spin up autonomous agents for customer support, data processing, code review, and internal operations. Many of these agents operate without centralized oversight, without consistent permission models, and without anyone knowing the full inventory of what's running. As Zenity's governance checklist for CISOs puts it: "Treat every agent as production-impacting unless proven otherwise."
The convergence of regulatory pressure, market maturation, and uncontrolled agent proliferation means the window for building in governance after the fact is closing. The organizations that embed it now will move faster later — not slower.
The Five Pillars of an Agent Governance Framework
Effective agent governance isn't a single tool or a single policy. Based on enterprise frameworks from Zenity, Mayer Brown's legal analysis, and the Singapore governance model, five pillars consistently appear across mature governance implementations.
1. Agent Discovery and Inventory
You can't govern what you can't see. The first pillar is knowing what agents exist, what they do, and where they operate.
This means maintaining a living inventory of every agent in production: what model it uses, what tools it has access to, what data it can read and write, who owns it, and when it was last reviewed. Shadow agents — the ones a team spun up on a Friday afternoon and forgot about — are the new shadow IT. They're the ones most likely to cause incidents because they were never designed with governance in mind.
Practically, this requires agent registration as part of your deployment pipeline. No agent reaches production without an inventory entry. Tools like AgentOps provide agent session tracking that can feed into a centralized inventory, giving you visibility into what's running and how it's behaving.
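To make the idea concrete, here is a minimal sketch of inventory-gated deployment. Everything in it — the `AgentRecord` fields, the `AgentInventory` class, and the agent names — is hypothetical and illustrative, not the schema of any particular tool:

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical inventory entry -- the fields mirror the questions the
# inventory pillar asks: what model, what tools, what data, who owns it.
@dataclass
class AgentRecord:
    name: str
    model: str
    tools: list[str]
    data_scopes: list[str]
    owner: str
    last_reviewed: date

class AgentInventory:
    """In-memory registry; a real deployment would back this with a database."""
    def __init__(self):
        self._agents: dict[str, AgentRecord] = {}

    def register(self, record: AgentRecord) -> None:
        self._agents[record.name] = record

    def require_registered(self, name: str) -> AgentRecord:
        # Called from the deployment pipeline: fail the deploy if no entry exists.
        if name not in self._agents:
            raise RuntimeError(f"agent '{name}' has no inventory entry -- blocking deploy")
        return self._agents[name]

inventory = AgentInventory()
inventory.register(AgentRecord(
    name="support-bot",
    model="gpt-4o",
    tools=["search_kb", "create_ticket"],
    data_scopes=["tickets:read", "tickets:write"],
    owner="support-eng",
    last_reviewed=date(2026, 3, 1),
))
```

The useful property is the failure mode: an unregistered "shadow agent" can't reach production at all, because `require_registered` raises before the deploy proceeds.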
2. Identity and Access Control
Every agent needs an identity — not a shared service account, not an API key pasted into environment variables, but a scoped identity with explicit permissions.
The principle is the same as human IAM (Identity and Access Management), adapted for autonomous systems. An agent that processes customer inquiries doesn't need write access to your billing database. An agent that generates reports doesn't need the ability to send emails. Least privilege isn't new, but enforcing it on agents that can dynamically decide what tools to call requires tighter controls than traditional role-based access.
The practical challenge is that agents don't follow static permission paths the way human users do. A customer support agent might decide mid-conversation that it needs to look up billing data, then escalate to a refund tool, then send a confirmation email — each step requiring different permissions. Static role-based access can't handle this. You need dynamic, per-action authorization that evaluates what the agent is trying to do at each step, not just what role it was assigned at deployment.
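A sketch of what per-action authorization looks like in practice, using the support-agent scenario above. The grant names and functions are invented for illustration; real systems would back this with a policy engine rather than an in-memory dict:

```python
# Hypothetical grants: permissions attach to (agent, action) pairs and are
# checked at call time, not assigned once at deployment.
AGENT_GRANTS = {
    "support-agent": {"billing:read", "refund:initiate", "email:send"},
}

def authorize(agent: str, action: str) -> None:
    """Evaluate each tool call against the agent's grants before execution."""
    if action not in AGENT_GRANTS.get(agent, set()):
        raise PermissionError(f"{agent} is not authorized for {action}")

def call_tool(agent: str, action: str, handler, *args):
    authorize(agent, action)   # gate every step, not just the role at deploy time
    return handler(*args)

# The support agent can look up billing data mid-conversation...
call_tool("support-agent", "billing:read", lambda: {"balance": 0})

# ...but a write to the billing database is denied at runtime, even though
# the agent dynamically decided to attempt it.
try:
    call_tool("support-agent", "billing:write", lambda: None)
except PermissionError as e:
    print(e)
```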
3. Runtime Guardrails
This is where the Fortune 500 incident broke down. A hardcoded string check is not a guardrail — it's a filter with one rule, and agents will find the path around one rule every time.
Runtime guardrails operate during agent execution, evaluating actions in real time against policies before those actions hit production systems. The distinction from pre-deployment testing is critical: pre-deployment testing catches the failures you anticipated. Runtime guardrails catch the failures you didn't.
Galileo's Agent Control platform, launched as open-source under Apache 2.0 on March 11, 2026, was designed specifically for this pillar. It uses a @control() decorator pattern that wraps agent actions in policy checks, with policies decoupled from application code. The architecture means compliance teams can update governance rules without requiring engineering to redeploy the agent — a separation that matters enormously in regulated environments.
Frameworks like Griptape build guardrails directly into the agent framework itself, providing structure-aware controls that go beyond string matching. ControlFlow takes a different approach, using structured workflow definitions that constrain what agents can do at each step of a task — governance through architecture rather than bolt-on filters.
4. Policy Management
Policies need to live outside agent code. This is the lesson from the Fortune 500 incident that goes beyond "use better guardrails" — even good guardrails fail if they're embedded in application logic that only engineers can update.
Policy-as-code is the pattern: governance rules defined in a declarative format, version-controlled, auditable, and updateable by compliance and security teams without touching the agent's codebase. When a new regulation lands or a new risk category emerges, the policy layer adapts without redeployment.
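A minimal sketch of the pattern, with the policy document represented as a Python list standing in for a version-controlled YAML or JSON file. The rule shapes and action names are assumptions for illustration, not any vendor's policy format:

```python
import fnmatch

# Hypothetical policy document -- in practice this lives in a
# version-controlled config file owned by compliance, not in agent code.
POLICIES = [
    {"match": "db.drop_*",  "decision": "block"},
    {"match": "email.send", "decision": "review"},
    {"match": "*",          "decision": "allow"},   # default rule
]

def evaluate(action: str) -> str:
    """First matching rule wins. Updating POLICIES changes governance
    behavior without redeploying the agent that calls evaluate()."""
    for rule in POLICIES:
        if fnmatch.fnmatch(action, rule["match"]):
            return rule["decision"]
    return "block"   # fail closed if no rule matches

print(evaluate("db.drop_table"))   # block
print(evaluate("email.send"))      # review
print(evaluate("kb.search"))       # allow
```

Because the rules are data rather than code, they can be diffed, reviewed, and rolled back like any other configuration change.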
Galileo's Agent Control implements this with a pluggable evaluator architecture — policies are modules that can be added, removed, or updated independently. This mirrors how infrastructure-as-code (Terraform, Pulumi) separated infrastructure management from application deployment, and it solves the same organizational problem: the people who understand the policies aren't always the people who write the code.
Cloudflare AI Gateway provides another policy enforcement point at the API layer — rate limiting, content filtering, and access controls that apply across all agents routing through the gateway, regardless of which framework built them.
5. Monitoring and Observability
An agent that passed all guardrails at deployment can still drift into dangerous behavior over time. Models update. Data distributions shift. Tool APIs change. Monitoring isn't a launch-day activity — it's a continuous operation.
Agent observability requires more than traditional application monitoring. You need to track not just "did the agent respond" but "what did the agent decide to do, why, and what was the outcome." This means logging tool calls, reasoning chains, policy evaluations, and action results.
Arize Phoenix provides AI-specific observability with trace-level visibility into agent decisions, including latency, token usage, and evaluation metrics across agent runs. AgentOps focuses specifically on agent session monitoring, capturing the full lifecycle of agent actions in a format designed for debugging and auditing. Datadog AI Observability extends traditional infrastructure monitoring into the AI layer, connecting agent behavior to the systems those agents interact with.
The CIO article on agentic AI in engineering workflows emphasized the need for "robust guardrails, circuit breakers, and comprehensive audit trails from the ground up." Circuit breakers — automatic kill switches that trigger when agent behavior exceeds defined thresholds — are the monitoring pillar's enforcement mechanism. They turn observability data into protective action.
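A circuit breaker of this kind is simple to sketch. The thresholds below are illustrative, not recommendations — the whole point is to tune them against your own monitoring baselines:

```python
import time
from collections import deque

class CircuitBreaker:
    """Trips when an agent performs too many actions inside a sliding
    time window -- a crude but effective "runaway loop" detector."""
    def __init__(self, max_actions: int, window_seconds: float):
        self.max_actions = max_actions
        self.window = window_seconds
        self.events = deque()
        self.open = False

    def record(self, now=None):
        """Call once per agent action; trips the breaker on excess frequency."""
        now = time.monotonic() if now is None else now
        self.events.append(now)
        # Drop events that have aged out of the window.
        while self.events and now - self.events[0] > self.window:
            self.events.popleft()
        if len(self.events) > self.max_actions:
            self.open = True   # halt the agent until a human resets it

    def allow(self) -> bool:
        return not self.open

breaker = CircuitBreaker(max_actions=5, window_seconds=1.0)
for t in range(7):                      # 7 actions in well under a second
    breaker.record(now=float(t) * 0.01)
print(breaker.allow())                  # False -- the breaker has tripped
```

In a real deployment, `record()` would be fed from the observability pipeline, and a tripped breaker would page the on-call engineer rather than just flip a flag.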
Mapping Tools to the Governance Stack
No single tool covers all five pillars. Here's how the current landscape maps to each layer — use this as a starting point for assembling your governance stack.
| Governance Layer | Tool | What It Covers |
|---|---|---|
| Control Plane | Galileo Agent Control | Centralized policy enforcement, pluggable evaluators, @control() decorator pattern. Apache 2.0. Integrates with CrewAI, Strands Agents, Glean, Cisco AI Defense. |
| Observability | Arize Phoenix | Trace-level agent decision analysis, drift detection, evaluation frameworks. Strong for identifying when agent behavior shifts over time. |
| Observability | AgentOps | Agent session tracking, replay, debugging. Answers "what did my agent do and why" post-incident. |
| Observability | Datadog AI Observability | Connects agent behavior to infrastructure monitoring. Best for teams already on Datadog. |
| Framework Guardrails | Griptape | Input/output validation, content filtering, structured tool use built into the agent framework. |
| Framework Guardrails | ControlFlow | Structured task graphs that constrain agent actions per step. Governance through architecture. |
| Infrastructure | Cloudflare AI Gateway | Framework-agnostic rate limiting, content policies, and access controls at the API layer. |
The practical approach: layer a control plane (Galileo) for policy enforcement, an observability tool (Arize Phoenix or AgentOps) for monitoring, and framework-level guardrails (Griptape or ControlFlow) for runtime constraints. Add infrastructure controls (Cloudflare) as a baseline safety net.
Practical Implementation: The Tiered Autonomy Model
Theory is useful. Shipping is better. Here's a concrete pattern you can implement today: the tiered autonomy model.
The idea is simple: classify every action an agent can take into one of three tiers based on risk level, then enforce different governance controls for each tier.
Green Tier — Full Autonomy
These are actions the agent can take without human approval. They're low-risk, reversible, or read-only.
Examples: Reading data, generating summaries, searching documentation, drafting content, running analysis on non-production data.
Controls: Logging only. Every action is recorded for audit purposes, but no approval gate slows execution. Rate limits prevent runaway loops.
Yellow Tier — Supervised Autonomy
These are actions that carry moderate risk. The agent can initiate them, but a human reviews before execution — or the agent executes with enhanced monitoring and automatic rollback capability.
Examples: Sending external messages, modifying non-critical configurations, creating resources in staging environments, updating CRM records.
Controls: Human-in-the-loop approval for high-stakes variants. Automatic rollback mechanisms for reversible actions. Enhanced logging with anomaly detection. Circuit breakers that pause the agent if action frequency exceeds thresholds.
Red Tier — Human-Only
These are actions the agent should never execute autonomously. The agent can recommend them, prepare them, and present them for approval — but a human pulls the trigger.
Examples: Deleting production data, making payments, modifying security configurations, publishing externally, changing credentials or access controls.
Controls: Hard blocks in the policy layer. The agent physically cannot execute red-tier actions — the control plane rejects the tool call before it reaches the target system. Attempting a red-tier action triggers an alert to the security team.
Implementing the Tiers
With Galileo Agent Control, this maps to the @control() decorator pattern. As shown in their announcement blog post, any function becomes a governed decision point with a single decorator:
```python
# Simplified illustration based on Galileo's @control() pattern
@control()
async def query_database(sql: str) -> Results:
    return await db.execute(sql)
```
The policies that determine whether each action is allowed, flagged, or blocked live on the Agent Control server — not in the agent's code. Your compliance team defines what's green, yellow, and red based on your organization's risk framework, and updates those definitions without touching the agent's deployment.
For teams not using a dedicated control plane, the same pattern works with framework-level controls. ControlFlow lets you define task-level constraints that map to tiers. Griptape provides guardrail hooks where tier logic can be injected. Even a simple middleware layer that classifies tool calls by risk level and routes them through different approval paths implements the core pattern.
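That middleware approach can be sketched in a few lines. The tool names, tier assignments, and the stubbed `approve()` function are all hypothetical placeholders — a real implementation would wire the yellow path to an actual review queue:

```python
# Hypothetical tier map: every tool call an agent can make is classified.
TIERS = {
    "search_docs": "green",
    "update_crm": "yellow",
    "delete_prod_data": "red",
}

def approve(tool: str) -> bool:
    # Stand-in for a real human-in-the-loop approval flow.
    return False

def route(tool: str, handler):
    """Classify a tool call by risk tier and route it accordingly."""
    tier = TIERS.get(tool, "red")   # unknown tools default to the strictest tier
    if tier == "red":
        raise PermissionError(f"{tool} is human-only; the agent cannot execute it")
    if tier == "yellow" and not approve(tool):
        return {"status": "pending_review", "tool": tool}
    return handler()                # green: execute, with logging only

result = route("update_crm", lambda: "done")
print(result["status"])             # pending_review
```

Note the default: a tool missing from the map is treated as red, which is the "start restrictive" principle encoded directly in the routing logic.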
The key principle: start restrictive, loosen with evidence. Launch with most actions at yellow or red. As you build confidence through monitoring data — weeks of clean execution, no anomalies, no near-misses — selectively promote actions to lower tiers. This is safer than starting permissive and tightening after an incident.
Where to Start
AI agent governance doesn't require a six-month platform migration. Start with these concrete steps:
1. Build your agent inventory this week. List every agent running in your organization. If you don't know the full list, that's the first problem to solve. Use AgentOps or equivalent tooling to discover agents by monitoring API traffic.
2. Classify actions by tier. Take your highest-risk agent and map every tool call it can make to green, yellow, or red. This exercise alone will surface governance gaps you didn't know existed.
3. Separate policy from code. Even if you don't adopt a full control plane, move your guardrail logic out of hardcoded if-statements and into a configuration layer that non-engineers can review and update.
4. Instrument everything. Add observability before you need it. Arize Phoenix and AgentOps both offer quick integration paths. When the 3am incident happens — and it will — you want the data to understand what went wrong.
5. Evaluate a control plane. Galileo Agent Control is open-source and Apache 2.0 licensed. It integrates with CrewAI and other major agent frameworks. The cost of evaluation is a few hours; the cost of an ungoverned agent in production is a Saturday at 3am wondering which database table just disappeared.
The organizations that treat governance as a day-one engineering discipline — not a compliance afterthought — will be the ones that scale agent deployments with confidence. The tools exist. The frameworks are published. The only question is whether you build the controls before the incident, or after.
Sources
- Galileo — "Announcing Agent Control" (March 11, 2026)
- GlobeNewsWire — Galileo Releases Open-Source AI Agent Control Plane (March 11, 2026)
- The New Stack — Galileo Agent Control Open Source (March 2026)
- Forrester — Announcing Our Evaluation of the Agent Control Plane Market (2026)
- PCMag — Meta Security Researcher's AI Agent Accidentally Deleted Her Emails (February 2026)
- Fast Company — Meta Superintelligence Safety Director Lost Control of Her AI Agent (February 2026)
- Business Insider — Meta AI Alignment Director Shares OpenClaw Email-Deletion Nightmare (February 2026)
- Vectra AI — AI Governance Tools (Singapore Framework) (2026)
- Zenity — AI Agent Governance Checklist for CISOs (2026)
- Mayer Brown — Governance of Agentic AI Systems (February 2026)
- CIO — How Agentic AI Will Reshape Engineering Workflows (2026)
- CertMage — Agentic AI Governance Frameworks (Google Report) (2026)
Related Tools
- AgentOps — Agent observability and session monitoring
- Arize Phoenix — AI observability with trace-level analysis
- ControlFlow — Structured AI workflow control with built-in constraints
- Griptape — Agent framework with integrated guardrails
- Cloudflare AI Gateway — API gateway with AI-specific policy controls
- Datadog AI Observability — Production AI monitoring
- CrewAI — Multi-agent orchestration with enterprise governance support