What is the most common safe-failure mistake teams make when deploying their first production agent?

Assuming that a successful HTTP response from a downstream system means the action was applied correctly and only once. In practice, many enterprise systems will accept a duplicate write and return 200. The agent has no way to know this without an idempotency key and deduplication logic on both sides of the call. The fix is to design for duplication as the default assumption, not the edge case.

How long should an agent retry before giving up and escalating to a human?

There is no universal number, but a practical starting point for most operational contexts is three attempts over no more than fifteen minutes for transient failures. Beyond that, the failure is likely not transient and more retries will not help. The more important design decision is what happens after the agent gives up. A dead-letter queue with a clear owner and a defined SLA for review is worth more than any retry tuning.

Can loop detection be handled entirely inside the agent's reasoning rather than in external code?

No, and relying on the model for this is a meaningful risk. Language models can reason about loops in principle, but they do not have reliable access to their own execution history across invocations, and their behaviour under repeated similar inputs is not guaranteed to be consistent. Loop detection needs to be implemented as deterministic code that checks a persistent state store before each action. Treat the model's reasoning as advisory on this, not authoritative.

How do you scope the blast radius of a bad agent run before it happens?

The best approach is to define a maximum records-affected threshold before the run starts and build a circuit breaker that halts the agent if that threshold is exceeded in a single run cycle. In most operational deployments, an agent that touches ten times its expected volume in one run is almost certainly in a loop or operating on bad input, not doing useful work. Setting this threshold requires a conversation with your operations stakeholders about what normal looks like, which is a useful conversation to have regardless.
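A minimal sketch of such a circuit breaker, assuming the expected volume has already been agreed with operations. The names `RunCircuitBreaker`, `BlastRadiusExceeded`, and `multiplier` are illustrative, not from any library, and the default of ten simply mirrors the rule of thumb above.

```python
class BlastRadiusExceeded(RuntimeError):
    """Raised when a run would touch more records than the agreed ceiling."""


class RunCircuitBreaker:
    """Counts records touched in one run cycle and halts past a fixed limit."""

    def __init__(self, expected_records: int, multiplier: int = 10) -> None:
        if expected_records <= 0 or multiplier <= 0:
            raise ValueError("expected_records and multiplier must be positive")
        # Assumption: the ceiling is a simple multiple of the expected volume.
        self.limit = expected_records * multiplier
        self.touched = 0

    def record_write(self, count: int = 1) -> None:
        """Call before each write; raises instead of letting the run continue."""
        if self.touched + count > self.limit:
            raise BlastRadiusExceeded(
                f"run would touch {self.touched + count} records, limit is {self.limit}"
            )
        self.touched += count
```

The agent loop would call `record_write()` before every write and treat `BlastRadiusExceeded` as a signal to stop the run and hand over to a human, rather than as an error to retry.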

Safe Failure Is a Design Choice Not an Accident in Production AI Agents, by Alexey Shurov

The plumbing that separates a bad agent run that fixes itself from one that corrupts three days of inventory records.

The Run That Wrote the Same Invoice Twice

A finance team in regional lending discovered their agent had processed the same disbursement instruction four times in eleven minutes. No error was logged. The agent had simply retried on a network timeout, and the downstream system accepted every call because nobody had made the write operation idempotent. The result was not a crash. It was silent, confident wrongness at four times the intended scale.

That is the failure mode that actually hurts in production. Not the dramatic exception stack that pages your on-call engineer at 2 a.m. The quiet one that runs to completion, returns a success status, and leaves corrupted state behind it. Building an agent that fails safely is entirely a design question, and most teams answer it too late, after the first real incident rather than before it.

Idempotency Is Not Optional It Is the Foundation

Every action your agent takes against an external system should be safe to repeat without changing the outcome beyond the first successful execution. This sounds obvious. It is almost never implemented fully on the first pass.

In a distribution context I worked in, agents were routing purchase orders to suppliers based on inventory signals. The agent would occasionally time out waiting for an acknowledgement from the supplier API, assume failure, and resubmit. The supplier system had no deduplication key on inbound orders. Within a week of go-live, the warehouse had received duplicate shipments on six SKUs. The fix was not in the agent logic. It was in requiring every write to carry a stable, deterministic request ID derived from the source record and the intended action, so the supplier system could recognise and discard a repeated call.

The pattern is the same whether you are writing to a ledger, updating a field service ticket, triggering a machine stop on a production line, or sending a customer notification. Generate the idempotency key before the action. Persist it. Pass it. If the downstream system does not support deduplication natively, build a thin wrapper that does. This is not glamorous engineering but it is the difference between a retry being safe and a retry being an incident.
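One way to sketch the generate, persist, pass pattern, under stated assumptions: `idempotency_key` and `DedupingWriter` are hypothetical names, the SHA-256 derivation is one reasonable choice rather than a prescribed one, and the in-memory dictionary stands in for the durable store a real wrapper would need.

```python
import hashlib
import json
from typing import Any, Callable, Dict


def idempotency_key(source_record_id: str, action: str) -> str:
    """Derive a stable key from the source record and the intended action."""
    # JSON-encode the pair so ("ab", "c") and ("a", "bc") cannot collide.
    canonical = json.dumps([source_record_id, action], separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class DedupingWriter:
    """Thin wrapper for a downstream system that has no native deduplication."""

    def __init__(self, send: Callable[[str, Dict[str, Any]], Any]) -> None:
        self._send = send
        # Assumption: in production this would be a durable, shared store.
        self._seen: Dict[str, Any] = {}

    def write(self, key: str, payload: Dict[str, Any]) -> Any:
        if key in self._seen:
            # Repeated call: return the first result without writing again.
            return self._seen[key]
        result = self._send(key, payload)
        self._seen[key] = result  # recorded only after a successful send
        return result
```

Because the key depends only on the source record and the action, a retry after a timeout produces the same key as the original call, which is what lets the wrapper discard it.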

Serialised Retries and Why Parallelism Is a Trap Under Failure

When an agent step fails, the instinct is to retry fast and in parallel to recover throughput. This is almost always wrong.

Parallel retries under partial failure create race conditions on shared state. In a manufacturing deployment where agents were updating work order status across a multi-site ERP, a transient database lock caused three concurrent agent threads to each believe they were the authoritative writer for the same work order. Each thread read stale state, computed a different next status, and wrote it. The work order ended up in a status that was not reachable by any valid workflow path. A human had to manually reconstruct the correct state from audit logs.

Serialised retries with exponential backoff and jitter are slower. They are also the only approach that is safe when your writes are not fully atomic or when your downstream systems have eventual consistency behaviour, which is most of the time in enterprise environments. Set a maximum retry count. Set a maximum elapsed time. When you exceed either, stop and route to a dead-letter queue or a human review step. Do not keep retrying indefinitely. An agent that cannot give up is more dangerous than one that fails fast.
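The retry policy described here might look like the following sketch. `TransientError`, `RetriesExhausted`, and `retry_serially` are illustrative names, the `dead_letter` callable stands in for whatever queue or review step you use, and the defaults (three attempts, fifteen minutes) are only a starting point to tune.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, rate limits)."""


class RetriesExhausted(Exception):
    """Raised after the failure has been handed to the dead-letter handler."""


def retry_serially(
    action: Callable[[], T],
    dead_letter: Callable[[Exception], None],
    max_attempts: int = 3,
    max_elapsed_s: float = 15 * 60,
    base_delay_s: float = 1.0,
    max_delay_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> T:
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    start = clock()
    last_error: Optional[Exception] = None
    for attempt in range(1, max_attempts + 1):
        try:
            return action()  # one attempt at a time, never concurrently
        except TransientError as exc:
            last_error = exc
        if attempt == max_attempts:
            break  # retry count exhausted
        # Exponential backoff with full jitter: random wait in [0, cap].
        cap = min(max_delay_s, base_delay_s * 2 ** (attempt - 1))
        delay = random.uniform(0, cap)
        if (clock() - start) + delay > max_elapsed_s:
            break  # the elapsed-time budget would be exceeded
        sleep(delay)
    assert last_error is not None
    dead_letter(last_error)  # give up explicitly instead of looping forever
    raise RetriesExhausted("gave up after bounded retries") from last_error
```

Only `TransientError` is retried. Any other exception propagates immediately, which keeps validation and permission failures out of the retry loop.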

Loop Filters Are How You Stop an Agent From Eating Itself

Agents that operate on event streams or polling queues can enter loops where the output of one step becomes the input that triggers the same step again. This is not a theoretical edge case. I have seen it happen in field operations, in finance reconciliation, and in distribution exception handling.

A field operations agent was designed to detect unresolved service tickets older than a threshold and escalate them. The escalation action updated a timestamp field on the ticket. The agent's polling query included that timestamp field in its filter logic. Every escalation pushed the ticket back into the active window. The agent escalated the same tickets in a tight loop for six hours before anyone noticed the notification volume.

The fix requires explicit loop detection at the agent level, not just at the system level. Before acting on a record, check whether this agent instance, or any recent instance, has already acted on this record in this run cycle. Maintain a short-lived action log keyed by record ID and action type. If the log shows a recent action, skip and log the skip rather than acting again. The window for this check should be longer than your longest expected run cycle, not shorter. In most operational contexts, a 24-hour deduplication window is a reasonable starting point.
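As a rough sketch of that check, assuming a 24-hour window: `ActionLog` and `should_act` are hypothetical names, and the in-memory dictionary is a stand-in for the persistent store that would let separate agent instances see each other's actions.

```python
import logging
import time
from typing import Callable, Dict, Tuple

logger = logging.getLogger("agent.loop_filter")


class ActionLog:
    """Short-lived log of actions, keyed by (record ID, action type)."""

    def __init__(
        self,
        window_s: float = 24 * 60 * 60,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._window_s = window_s
        self._clock = clock
        # Assumption: a real deployment keeps this in a shared, persistent store.
        self._last_action: Dict[Tuple[str, str], float] = {}
        self.skipped = 0

    def should_act(self, record_id: str, action_type: str) -> bool:
        """Return True and record the action, or False if it is a repeat."""
        now = self._clock()
        key = (record_id, action_type)
        last = self._last_action.get(key)
        if last is not None and now - last < self._window_s:
            self.skipped += 1
            logger.info("skip %s on %s: acted %.0fs ago", action_type, record_id, now - last)
            return False
        self._last_action[key] = now
        return True
```

In the ticket escalation example, the agent would call `should_act(ticket_id, "escalate")` before escalating, so a ticket that re-enters the polling window is skipped rather than escalated again.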

Recovery Logic Is What Separates a Shrug From an Incident

Not every failed run needs human intervention. The decision about which failures are self-recoverable and which require escalation is one of the most important design choices you will make, and it needs to be explicit, not implicit.

In a lending operations deployment, agents were processing document verification steps as part of loan origination. The design team initially set up a simple binary, success or failure, with all failures routing to a human queue. Within a week the human queue was flooded with transient API timeouts that the agent could have retried safely. Reviewers were spending most of their time clearing noise rather than handling genuine exceptions.

The right model is a tiered classification of failure types. Transient infrastructure failures (network timeouts, rate limit responses, temporary service unavailability) are candidates for automatic retry with no human involvement. Validation failures, where the input data is malformed or missing required fields, should be routed to the data owner, not to a technical reviewer. State conflicts, where the agent finds the record in an unexpected condition, should be escalated to a domain expert who understands the business process. Security or permission failures should immediately halt the run and alert the team regardless of the hour.

Write this classification down before you build. Make it a first-class artifact of your agent design, as explicit as your data schema. The model powering your agent will not always give you a clean signal about which category a failure falls into, so the classification logic needs to live in deterministic code around the model, not inside the model's reasoning.
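A deterministic classifier along these lines could be sketched as follows. The mapping from HTTP status codes to categories is an assumption chosen for illustration, as are the names `FailureClass`, `classify`, and `ROUTES`. The `UNKNOWN` bucket is also an addition here, so that anything unrecognised goes to a human rather than being retried blindly.

```python
from enum import Enum
from typing import Dict, Optional


class FailureClass(Enum):
    TRANSIENT = "transient"
    VALIDATION = "validation"
    STATE_CONFLICT = "state_conflict"
    SECURITY = "security"
    UNKNOWN = "unknown"  # assumption: not one of the four tiers in the text


# Where each class of failure goes. Handler names are placeholders.
ROUTES: Dict[FailureClass, str] = {
    FailureClass.TRANSIENT: "retry_automatically",
    FailureClass.VALIDATION: "route_to_data_owner",
    FailureClass.STATE_CONFLICT: "escalate_to_domain_expert",
    FailureClass.SECURITY: "halt_run_and_alert",
    FailureClass.UNKNOWN: "escalate_to_human_review",
}


def classify(status: Optional[int], timed_out: bool = False) -> FailureClass:
    """Map a failed call to a failure class using fixed, reviewable rules."""
    if status in {401, 403}:
        return FailureClass.SECURITY
    if timed_out or status in {408, 429, 502, 503, 504}:
        return FailureClass.TRANSIENT
    if status == 409:
        return FailureClass.STATE_CONFLICT
    if status in {400, 422}:
        return FailureClass.VALIDATION
    return FailureClass.UNKNOWN
```

The point of the table is that it can be reviewed and versioned like a schema. The model never decides which tier a failure belongs to.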

Observability Is Not Logging It Is Knowing What the Agent Actually Did

Most teams instrument their agents for performance. Latency, token counts, step durations. Very few instrument them for correctness at the action level. These are different things and the second one is the one that matters for safe failure.

For every write action an agent takes, you need a durable record that captures the record identifier, the action taken, the before state if you can capture it, the after state, the agent run ID, the timestamp, and whether the action was a first attempt or a retry. This is your audit trail and it is also your recovery tool. When something goes wrong, and it will, this log is what lets you reconstruct what happened, identify the blast radius, and reverse or correct the affected records without guessing.
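To make those fields concrete, here is one possible shape for an entry, written as a sketch. The names `ActionLogEntry`, `to_json_line`, and `append_entry` are illustrative, and the JSON-lines format is an assumption. Any durable, append-only store would serve.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, Optional, TextIO


def _utc_now() -> str:
    return datetime.now(timezone.utc).isoformat()


@dataclass(frozen=True)
class ActionLogEntry:
    record_id: str
    action: str
    before: Optional[Dict[str, Any]]  # None when prior state was not captured
    after: Dict[str, Any]
    run_id: str
    attempt: int  # 1 for the first attempt, 2 or more for a retry
    timestamp: str = field(default_factory=_utc_now)


def to_json_line(entry: ActionLogEntry) -> str:
    """Serialise one entry as a single JSON line with an explicit retry flag."""
    record = asdict(entry)
    record["is_retry"] = entry.attempt > 1
    return json.dumps(record, sort_keys=True)


def append_entry(sink: TextIO, entry: ActionLogEntry) -> None:
    """Append to an open, append-mode text sink and flush straight away."""
    sink.write(to_json_line(entry) + "\n")
    sink.flush()
```

Keeping `before` and `after` side by side is what makes a targeted correction script possible: the reversal for each record can be read directly from the log.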

In a distribution context, an agent was updating shipment priority flags based on customer tier signals. A bug in the tier classification logic caused a batch of standard shipments to be flagged as priority. Because the team had full action-level logging, they could identify every affected shipment record within minutes, produce a correction script that reversed only those specific changes, and verify the reversal was complete. Without that log, the recovery would have required a full audit of the shipment table against the source system, which would have taken days.

Build the action log first. Build the dashboards second. The dashboards are useful. The action log is essential.

The Practical Takeaway for Your Next Agent Build

Before you write a single line of agent logic, answer these five questions in writing and get agreement from your operations stakeholders.

What is the idempotency key for every write this agent performs, and does the downstream system honour it?
What is the maximum number of retries and elapsed time before this agent stops and routes to a dead-letter queue?
What is the deduplication window that prevents this agent from acting on the same record twice in one cycle?
What are the three or four failure categories this agent can encounter, and which category gets automatic retry, which gets routed to a domain expert, and which halts the run immediately?
What does a complete action log entry look like for this agent, and where is it stored in a way that survives a failed run?

If you cannot answer all five before you build, you are not ready to run this agent in production. The model at the centre of your agent is capable and it is also capable of being confidently wrong. The plumbing around it is what decides whether that wrongness is a recoverable blip or a three-day data recovery project. That plumbing is your job, not the model's.
