The 4 Failure Modes of AI Agents in Production (and How to Mitigate Them)
The promise and the reality
The promise of agents is seductive: hand an LLM a goal and a handful of tools, and let it iterate in a loop until it decides it’s done. Read, reason, call a tool, observe the result, reason again. It sounds like magic.
The reality in production is less romantic. That loop is exactly where the money burns and where state corrupts. And here’s the uncomfortable part: the failures you’ll see are not model-quality problems. A bigger model doesn’t fix them. They’re control-loop problems — the same kind you’ve been solving for decades with watchdogs, circuit breakers, and state machines, only applied to a new domain.
Four failure modes recur across any agentic runtime, regardless of framework. If you build on agents, you will hit them. The good news is that each has a known, cheap mitigation. This post is the map.
Anatomy of a turn
Before the failures, the vocabulary. A turn is one cycle: an LLM invocation plus the execution of whatever tools it asks for. A task is many turns chained together until an exit condition. The agent “lives” inside that loop of turns.
Everything that can go wrong with a loop falls into three buckets: it never ends, it ends without making progress, or it ends wrong. The four failure modes are concrete variants of those three. Let’s take them one at a time.
Failure 1: the loop that never ends
Symptom. The agent calls the same tool with the same arguments over and over. Or it alternates between two actions in an A-B-A-B-A cycle. Or it touches the same resource forty times while nothing changes. The token bill climbs in a straight line and no result comes out.
Why it happens. The LLM has no reliable memory of “I already tried this and it didn’t work.” A slightly ambiguous tool result makes it retry the same thing. A state it doesn’t understand leaves it spinning. It’s not that the model is dumb — it’s that the loop has no brakes.
The mitigation. A loop detector that inspects the recent history of tool calls. Three patterns cover most cases:
    // Loop signals over the recent tool-call history.
    function detectLoop(history: ToolCall[]): LoopSignal | null {
      // Read-only tools (read, list, status) are EXCLUDED:
      // repeating them is legitimate, not a loop.
      const actions = history.filter((c) => !c.tool.isReadOnly);
      // 1. Identical consecutive calls (same tool + same args)
      if (lastNAreIdentical(actions, 3)) {
        return { kind: "identical", advice: "abort" };
      }
      // 2. The same target touched N times without a single state change
      if (sameTargetWithoutWrite(actions, 4)) {
        return { kind: "target-repetition", advice: "abort" };
      }
      // 3. Alternating cycle A-B-A-B-A
      if (isAlternatingCycle(actions, 5)) {
        return { kind: "alternating", advice: "abort" };
      }
      return null;
    }
The detail that separates a useful loop detector from one that sabotages your agents: don’t count read-only tools. Reading a file, listing a directory, or checking a status are actions an agent legitimately repeats while it reasons. Count them as repetition and you’ll kill agents that are working fine. Count only the actions with side effects — the ones that change something.
The parameter that matters. The threshold N. Too low and you cancel valid work (an agent that reasonably retries twice). Too high and you pay thousands of tokens before cutting it off. There’s no universal number: it depends on the tool class. Start conservative, measure, and tune with real data from your own loops.
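The detector above leans on three helper predicates that the post does not spell out. Here is a minimal sketch of how they might look. The `ToolCall` shape (in particular the `target` and `changedState` fields) and the `fingerprint` helper are assumptions made for illustration, not part of any specific framework.

```typescript
// Hypothetical shape of a recorded tool call; field names are assumptions.
interface ToolCall {
  tool: { name: string; isReadOnly: boolean };
  args: Record<string, unknown>;
  target?: string;        // resource the call touched, e.g. a file path
  changedState?: boolean; // set by the runtime when the call altered something
}

// Two calls count as "the same" when tool name and serialized args match.
// JSON.stringify is key-order sensitive; a real system may want a canonical form.
const fingerprint = (c: ToolCall): string =>
  `${c.tool.name}:${JSON.stringify(c.args)}`;

// 1. The last n calls are identical.
function lastNAreIdentical(calls: ToolCall[], n: number): boolean {
  if (n < 1 || calls.length < n) return false;
  const tail = calls.slice(-n).map(fingerprint);
  return tail.every((f) => f === tail[0]);
}

// 2. The last n calls hit the same target and none of them changed state.
function sameTargetWithoutWrite(calls: ToolCall[], n: number): boolean {
  if (n < 1 || calls.length < n) return false;
  const tail = calls.slice(-n);
  const target = tail[0]?.target;
  if (target === undefined) return false;
  return tail.every((c) => c.target === target && !c.changedState);
}

// 3. The last n calls alternate between two distinct calls (A-B-A-B-A).
function isAlternatingCycle(calls: ToolCall[], n: number): boolean {
  if (n < 3 || calls.length < n) return false;
  const tail = calls.slice(-n).map(fingerprint);
  if (tail[0] === tail[1]) return false; // A-A-A is "identical", not alternating
  return tail.every((f, i) => f === tail[i % 2]);
}
```

Each predicate takes its threshold as an argument, so the numbers can be tuned per tool class rather than fixed globally.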
Failure 2: the turn that hangs
Symptom. A turn starts and never finishes. The LLM call hangs, a tool blocks on a network operation with no timeout, a subprocess dies and the parent keeps waiting for a response that will never come. The agent’s “slot” stays occupied forever. In a system running several agents in parallel, a handful of hung turns will eat your entire capacity.
Why it happens. LLM provider timeouts, unbounded I/O, deadlocks, a child process the OS killed for memory that nobody noticed. Anything that can block, in production, eventually blocks.
The mitigation. The classic liveness-check pattern, applied to turns: heartbeats plus a watchdog. Each turn emits a heartbeat (a timestamp) while it’s alive. A background process scans periodically, and when it finds a turn whose last heartbeat is older than the threshold, it declares it dead, releases the slot, and optionally requeues the task.
    [watchdog] scanning active turns… 42 alive
    [watchdog] turn 9c3f stale: last heartbeat 6m 12s ago
    [watchdog] turn 9c3f -> marked DEAD, slot released, task requeued
    [watchdog] scanning active turns… 41 alive
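The heartbeat-and-sweep mechanism can be sketched in a few lines. This is an illustrative in-memory version: `TurnRegistry`, `TurnRecord`, and the method names are hypothetical, and a production system would likely keep heartbeats in a shared store so the watchdog survives a crash of the worker process.

```typescript
type TurnStatus = "alive" | "dead";

interface TurnRecord {
  id: string;
  lastHeartbeatMs: number;
  status: TurnStatus;
}

class TurnRegistry {
  private turns = new Map<string, TurnRecord>();

  // Called by the turn itself on start and then periodically while it runs.
  // Heartbeats from a turn already declared dead are ignored.
  heartbeat(id: string, nowMs: number): void {
    const turn = this.turns.get(id);
    if (turn === undefined) {
      this.turns.set(id, { id, lastHeartbeatMs: nowMs, status: "alive" });
    } else if (turn.status === "alive") {
      turn.lastHeartbeatMs = nowMs;
    }
  }

  // Called by the watchdog on a timer. Returns the ids it just declared dead,
  // so the caller can release their slots and requeue the tasks.
  sweep(nowMs: number, staleAfterMs: number): string[] {
    const dead: string[] = [];
    this.turns.forEach((turn) => {
      if (turn.status === "alive" && nowMs - turn.lastHeartbeatMs > staleAfterMs) {
        turn.status = "dead";
        dead.push(turn.id);
      }
    });
    return dead;
  }

  aliveCount(): number {
    let n = 0;
    this.turns.forEach((t) => {
      if (t.status === "alive") n++;
    });
    return n;
  }
}
```

One design choice worth noting: a dead turn stays dead even if a late heartbeat arrives. Once the task has been requeued, letting the original turn come back to life would mean two workers acting on the same task.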
The watchdog is the safety net, not the first line of defense. Every external call (to the LLM, to a tool, to a subprocess) must have its own explicit timeout. The watchdog exists to catch what you forgot to bound. If you rely on the watchdog alone, your turns will take minutes to die instead of seconds.
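As a sketch of what "its own explicit timeout" can look like, here is one way to wrap any async call. `withTimeout` and `TimeoutError` are hypothetical names; the pattern is a plain `Promise.race` against a timer, plus an `AbortSignal` so the callee has a chance to cancel its own I/O.

```typescript
class TimeoutError extends Error {
  constructor(label: string, ms: number) {
    super(`${label} timed out after ${ms} ms`);
    this.name = "TimeoutError";
  }
}

// Races `work` against a timer. The AbortSignal lets the callee cancel its
// own I/O (for example by passing it to fetch); if the callee ignores the
// signal, the losing operation keeps running in the background.
async function withTimeout<T>(
  label: string,
  ms: number,
  work: (signal: AbortSignal) => Promise<T>,
): Promise<T> {
  const controller = new AbortController();
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => {
      controller.abort();
      reject(new TimeoutError(label, ms));
    }, ms);
  });
  try {
    return await Promise.race([work(controller.signal), timeout]);
  } finally {
    // Always clear the timer so a fast call does not leave it pending.
    if (timer !== undefined) clearTimeout(timer);
  }
}
```

The timeout only bounds how long the caller waits. Actually stopping the underlying work depends on the callee honoring the signal, which is why the watchdog is still needed behind it.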
Failure 3: the turn that does nothing
Symptom. The agent produces a beautiful paragraph explaining what it’s going to do, or it flatly declares “the task is complete” — but it calls zero tools. Nothing happened. State is untouched. This is especially common in coordination roles and with models that drift into conversation mode.
Why it happens. Instruct models are trained to be eloquent. If you don’t force the function call, they hand you prose. And there’s a subtler, more dangerous variant: a model that believes it’s done but hasn’t touched anything.
The mitigation. An execution gate. For turns that are supposed to act, require at least one tool call. Zero calls on a turn that was meant to execute something equals a failed turn — don’t accept the narration as a result. Genuinely conversational or coordination roles are exempt.
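A minimal sketch of such a gate follows. The `TurnResult` shape and the `kind` field are assumptions; the point is only that the runtime, not the model, decides whether a turn counts.

```typescript
type TurnKind = "action" | "conversation";

// Hypothetical summary of a finished turn.
interface TurnResult {
  kind: TurnKind;
  text: string;
  toolCalls: { tool: string }[];
}

type GateVerdict = { ok: true } | { ok: false; reason: string };

function executionGate(turn: TurnResult): GateVerdict {
  // Conversational and coordination turns are exempt from the gate.
  if (turn.kind === "conversation") return { ok: true };
  // An action turn with no tool calls is a failed turn, however good the prose.
  if (turn.toolCalls.length === 0) {
    return { ok: false, reason: "action turn finished with zero tool calls" };
  }
  return { ok: true };
}
```

Passing the gate means only that the turn did something, not that it did the right thing.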
But the underlying lesson is bigger than a gate. It’s the single most important rule in this whole post:
An agent that marks its own work as done will lie to you. Not out of malice: it hallucinates completion the same way it hallucinates anything else. “Done” cannot be an assertion by the model; it has to be a consequence of verifiable effects — a tool that changed state, a check that passed, a test that went green. If your system allows self-promotion to “complete,” you don’t have a reliable system, you have an optimistic-report generator.
In practice this means the transition to “complete” is decided by the runtime after checking real effects, not by the agent saying so. It costs more to build. It’s the difference between a demo and production.
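One way to picture a runtime-owned transition is sketched below. `EffectCheck` and `resolveCompletion` are hypothetical names, and the checks themselves (a test run, a file diff, a status probe) are whatever your system can verify independently of the agent.

```typescript
// A check is any runtime-side probe of a real effect; names are illustrative.
interface EffectCheck {
  name: string;
  run: () => boolean;
}

type TaskState = "in_progress" | "complete";

interface CompletionResult {
  state: TaskState;
  failed: string[];       // names of checks that did not pass
  claimMismatch: boolean; // agent said "done" but the effects disagree
}

// The agent's claim is recorded for diagnostics but never decides the outcome.
function resolveCompletion(
  agentClaimsDone: boolean,
  checks: EffectCheck[],
): CompletionResult {
  const failed = checks.filter((c) => !c.run()).map((c) => c.name);
  // No checks defined means nothing is verifiable, so the task cannot complete.
  const verified = checks.length > 0 && failed.length === 0;
  return {
    state: verified ? "complete" : "in_progress",
    failed,
    claimMismatch: agentClaimsDone && !verified,
  };
}
```

Tracking `claimMismatch` over time also gives a rough measure of how often the agent reports completion it has not earned.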
Failure 4: the misclassified error
Symptom. Something fails. A naive system does one of three things, all bad: it retries forever, it gives up at the first error, or it treats every error the same. But a transient network blip and a “you don’t have permission” demand opposite responses. Retrying a permission error is burning tokens; giving up on a transient 5xx is abandoning work that would have succeeded on the second try.
Why it happens. Errors are heterogeneous, and code tends to treat them as a uniform blob — a generic catch that doesn’t distinguish.
The mitigation. An explicit error taxonomy wired to a policy table. Every error type gets a row: how many retries, what backoff, and what terminal action when they run out.
| Error type | Retries | Backoff | Action on exhaustion |
|---|---|---|---|
| Transient (network, 5xx) | 3 | exponential | escalate to supervisor |
| Rate limit (429) | 5 | exponential + jitter | escalate to supervisor |
| Invalid input (other 4xx, e.g. 400, 422) | 0 | — | fix the input / abort |
| Permissions / auth | 0 | — | escalate to a human |
| Contract / logic | 1 | — | abort |
| Unknown | 2 | exponential | escalate to a human |
Two properties make this table work. First: it’s exhaustive — every error lands in some row, and unknowns get a conservative default policy (a few retries, then escalate). Second, and critical: the action on exhaustion is never “silently mark as complete.” When the attempts run out, you escalate or abort. You never fake success. It’s the same principle as Failure 3, seen from the other side.
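The table translates almost directly into code. The sketch below is one possible encoding: the type names, the status-based `classify` function, the 500 ms base delay, and the use of full jitter (a uniformly random delay up to the exponential ceiling) are all assumptions for illustration. The "fix the input / abort" action is simplified to `abort`.

```typescript
type ErrorClass =
  | "transient" | "rate_limit" | "invalid_input"
  | "auth" | "contract" | "unknown";
type Backoff = "none" | "exponential" | "exponential_jitter";
type TerminalAction = "escalate_supervisor" | "escalate_human" | "abort";

interface Policy { retries: number; backoff: Backoff; onExhaustion: TerminalAction }

// Record<ErrorClass, Policy> makes the compiler reject a missing row,
// which is one way to enforce exhaustiveness.
const POLICIES: Record<ErrorClass, Policy> = {
  transient:     { retries: 3, backoff: "exponential",        onExhaustion: "escalate_supervisor" },
  rate_limit:    { retries: 5, backoff: "exponential_jitter", onExhaustion: "escalate_supervisor" },
  invalid_input: { retries: 0, backoff: "none",               onExhaustion: "abort" },
  auth:          { retries: 0, backoff: "none",               onExhaustion: "escalate_human" },
  contract:      { retries: 1, backoff: "none",               onExhaustion: "abort" },
  unknown:       { retries: 2, backoff: "exponential",        onExhaustion: "escalate_human" },
};

// Hypothetical failure descriptor. A real classifier would also inspect
// error codes, exception types, and tool-specific signals.
interface Failure { status?: number; networkError?: boolean }

function classify(f: Failure): ErrorClass {
  if (f.networkError) return "transient";
  const s = f.status;
  if (s === undefined) return "unknown";
  if (s === 429) return "rate_limit";
  if (s === 401 || s === 403) return "auth";
  if (s >= 500 && s <= 599) return "transient";
  if (s >= 400 && s <= 499) return "invalid_input";
  return "unknown";
}

type Decision =
  | { action: "retry"; delayMs: number }
  | { action: TerminalAction };

// attempt = retries already made for this error (0 on the first failure).
function decide(
  cls: ErrorClass,
  attempt: number,
  baseMs: number = 500,
  rand: () => number = Math.random,
): Decision {
  const p = POLICIES[cls];
  // Out of retries: escalate or abort. There is no "mark complete" branch.
  if (attempt >= p.retries) return { action: p.onExhaustion };
  if (p.backoff === "none") return { action: "retry", delayMs: 0 };
  const ceiling = baseMs * Math.pow(2, attempt);
  const delayMs =
    p.backoff === "exponential_jitter" ? Math.floor(rand() * ceiling) : ceiling;
  return { action: "retry", delayMs };
}
```

Because `TerminalAction` has no "complete" member, faking success on exhaustion is not something the policy table can express.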
The common thread
Once you’ve been fighting this for a while, you see that all four failures are the same story told four times: the agent is not the authority on its own progress.
- The infinite loop is the agent believing it’s making progress when it isn’t.
- The hung turn is the agent giving no sign of life and the system not noticing.
- The empty turn is the agent claiming action where there was none.
- The misclassified error is the agent (or a lazy catch) misreading what a failure means.
In all four, the fix is fundamentally the same: the runtime, not the model, decides when to stop, what counts as alive, what counts as action, and what an error means. Treat the agent as a capable but unreliable worker inside a reliable harness. The harness is where production robustness lives.
Observability: see the failure before you pay for it
You can’t mitigate what you can’t see. All four failure modes are invisible until the bill or the support ticket arrives — unless you instrument the loop.
The concrete recommendation: adopt the OpenTelemetry semantic conventions for GenAI. One span per turn with gen_ai.* attributes (system, model, input and output tokens, finish reason), and child spans per tool call. On top of that, a handful of metrics: turns executed, tokens consumed, LLM latency, tool executions, retries and escalations.
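To make the attribute list concrete, here is a sketch that builds the attribute maps you would attach to a turn span and to a tool-call child span. It deliberately uses plain objects rather than an SDK call. The attribute keys follow the GenAI semantic conventions as I understand them, but those conventions are still evolving and names have changed between versions (newer releases move away from `gen_ai.system`, for example), so check them against the version your SDK targets. `TurnUsage` and both function names are hypothetical.

```typescript
type Attributes = Record<string, string | number | string[]>;

// Hypothetical summary of one LLM invocation within a turn.
interface TurnUsage {
  system: string;
  model: string;
  inputTokens: number;
  outputTokens: number;
  finishReasons: string[];
}

// Attributes for the per-turn span.
function turnSpanAttributes(u: TurnUsage): Attributes {
  return {
    "gen_ai.system": u.system,
    "gen_ai.request.model": u.model,
    "gen_ai.usage.input_tokens": u.inputTokens,
    "gen_ai.usage.output_tokens": u.outputTokens,
    "gen_ai.response.finish_reasons": u.finishReasons,
  };
}

// Attributes for each tool-call child span.
function toolSpanAttributes(toolName: string): Attributes {
  return { "gen_ai.tool.name": toolName };
}
```

Keeping attribute construction in one place makes it easier to follow the conventions as they change, since the keys live in two functions instead of being scattered across the loop.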
With that telemetry, the failures stop being invisible:
- The infinite loop shows up as a spike in tokens per minute.
- The hung turn is a span that opens and never closes.
- The empty turn is a turn whose span has no tool-call children.
- The misclassified error is a spike of retries on the same error type.
Of all the metrics, tokens-per-minute with an alert is the cheapest insurance you’ll ever buy. A runaway loop at 3 a.m. can cost more in one night than your entire infrastructure costs in a month. A simple alert on spend wakes you up before the invoice does.
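If you do not yet have a metrics backend to alert from, the same check can live in the runtime itself. The following is a minimal sliding-window sketch; `TokenRateMonitor`, the window length, and the budget are hypothetical and would need tuning to your own traffic.

```typescript
class TokenRateMonitor {
  private samples: { atMs: number; tokens: number }[] = [];
  private readonly windowMs: number;
  private readonly maxTokensPerWindow: number;

  constructor(windowMs: number, maxTokensPerWindow: number) {
    this.windowMs = windowMs;
    this.maxTokensPerWindow = maxTokensPerWindow;
  }

  // Record usage for one LLM call and report whether the window is over budget.
  record(tokens: number, nowMs: number): { total: number; alert: boolean } {
    this.samples.push({ atMs: nowMs, tokens });
    // Drop samples that have aged out of the window.
    const cutoff = nowMs - this.windowMs;
    this.samples = this.samples.filter((s) => s.atMs > cutoff);
    const total = this.samples.reduce((sum, s) => sum + s.tokens, 0);
    return { total, alert: total > this.maxTokensPerWindow };
  }
}
```

When `alert` is true, the caller decides what happens next: page someone, pause the task, or both. The monitor only reports.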
The checklist
If you take away one thing, make it this list. It’s the minimum viable kit for putting agents into production without surprises:
- Loop detector (identical / alternating / same target), exempting read-only tools.
- Heartbeat plus watchdog for hung turns, and an explicit timeout on every external call.
- Execution gate: action turns require at least one tool call; narration never counts as success.
- Error taxonomy with a policy table; the action on exhaustion is escalate or abort, never fake completion.
- Never auto-Done: completion is decided by verifiable effects, not by the agent’s word.
- OTel GenAI instrumentation plus an alert on the token rate.
FAQ
What are the main failure modes of AI agents in production?
Four recur in any agentic runtime: the loop that never ends (the agent repeats actions without progress), the turn that hangs (never finishes and holds capacity), the turn that does nothing (narrates instead of acting), and the misclassified error (every failure treated the same). None is a model-quality problem; they're control-loop failures.
Why does an AI agent get stuck in an infinite loop?
Because the LLM has no reliable memory that it already tried something without success, and an ambiguous result makes it retry the same thing. It's not a lack of intelligence — the loop just has no brakes. Mitigate it with a loop detector for identical calls, alternating cycles, or the same target repeated, exempting read-only tools.
How do you detect an agent turn that has hung?
With heartbeats and a watchdog: each turn emits a heartbeat while alive, and a background process marks as dead any turn whose last heartbeat exceeds a threshold, releasing its slot. On top of that, every external call (to the LLM, a tool, a subprocess) must have its own explicit timeout.
Does a more powerful AI model fix these failures?
No. They're control-loop failures, not intelligence failures, so a bigger model won't fix them. They're solved with well-understood engineering (loop detection, watchdogs, execution gates, error taxonomies, and observability) applied to the runtime that wraps the agent.
Conclusion
None of these four failures is fixed by a better model, because none of them is a failure of intelligence: they’re control-loop failures. And that, in fact, is good news. It means you’re not facing a new, unsolvable problem, but old and well-understood ones (watchdogs, circuit breakers, state machines, error taxonomies) reapplied to a domain that looks new but isn’t, really.
The agent brings the intelligence. You bring the harness. Production reliability doesn’t live in the prompt: it lives in the code that wraps the loop.