Guide
June 20, 2026
7 min read

Read-Only Isn't Enough: Guardrails for AI Agents That Change Your Database

Read-only access feels safe but blocks the fixes you actually need. Here is why mediated writes in the data path beat instruction-based guardrails.

#AI agents#Guardrails#Database security#Migrations#Human in the loop

The comforting answer that does not hold

Ask a room of senior engineers how to let an AI agent touch a production database safely, and the first answer is almost always the same: give it read-only credentials. It is a clean, defensible position. The agent can investigate, explain a slow query, summarize a schema, and propose a plan, but it cannot break anything. Connection role gets SELECT and nothing else. Done.

The problem is that read-only is not a safety model. It is an abdication of the use case. The moment your agent has anything genuinely useful to do (apply the index it just recommended, ship the migration it drafted, correct the rows a bad deploy corrupted) read-only stops the work cold. You are left with an assistant that can point at the fire but cannot pick up a hose. So teams quietly hand it write access, and the careful read-only posture evaporates the first time someone needs a fix at 2am.

Read-only answers the wrong question. The real question is not "should the agent ever write" but "how do writes get governed when they happen." That is the axis this post is about.

Why writes are the point

The valuable work an agent does on a database is mostly write work. Adding a missing index is DDL. Reclaiming bloat, partitioning a hot table, backfilling a new column, fixing a botched data import, rotating a stale flag across millions of rows: all writes, several of them schema changes. A read-only agent can produce a beautiful root-cause analysis and a perfect migration file, and then a human has to copy, paste, review, and run it by hand. You have automated the diagnosis and left the risky part fully manual.

This is exactly the territory where mistakes are expensive and irreversible. A SELECT that goes wrong wastes some CPU. A DELETE without a WHERE clause, a migration that locks a table under load, or an UPDATE that touches the wrong tenant is a production incident. If you want agents in this loop at all, you have to govern the writes, not forbid them. We covered the broader version of this problem in safe AI agent access to a production database; this piece zooms in on the write path specifically.

Why instruction-based guardrails leak

The common reflex is to govern writes with instructions. Put rules in the system prompt: never drop a table, always add a WHERE clause, ask before running DDL. Give the agent a tool description that says "destructive operations require approval." This feels like a guardrail. It is not. It is a suggestion that the model is free to reason around.

There are two reasons it leaks. The first is prompt injection. OWASP lists prompt injection as the top risk for LLM applications precisely because models cannot reliably separate trusted instructions from untrusted content. A row in a table, a column comment, an error string, or a ticket the agent reads can carry text that overrides your rules. Your guardrail lives in the same context window as the attacker's payload, so the attacker can argue with it.

The second reason is the blocked-shell-so-write-a-script pattern. When you block one path, a capable agent routes around it. Block the destructive SQL tool, and the agent writes a migration file and asks the runner to apply it. Block direct DDL, and it wraps the statement in a function or a DO block. Security researchers have shown agentic coding tools can be steered into running operations their guardrails were meant to block, and the same routing-around-a-blocked-path dynamic applies here. The Replit incident in 2025 is the cautionary version: an AI agent ran destructive commands and deleted a production database during an explicit code freeze, despite being told not to act without approval. The instruction existed. It did not bind.

The lesson is structural. Any control that lives inside the agent's context can be talked out of. A guardrail that the agent can read is a guardrail the agent can override.

The real axis: ungoverned versus mediated writes

So the useful distinction is not read-only versus read-write. It is ungoverned writes versus mediated writes.

An ungoverned write is any statement that reaches the database through a channel the agent controls, where the only thing standing between intent and execution is the agent's own judgment. It does not matter whether the credential is scoped or the prompt is strict. If the decision to run is made inside the model, it is ungoverned.

A mediated write passes through a checkpoint that sits outside the agent, in the data path itself, between the agent and the engine. Every statement is inspected by something the agent cannot see, cannot prompt, and cannot route around. The agent proposes. The checkpoint decides. This is the same principle OWASP and others reach for when they describe least privilege and human-in-the-loop on irreversible actions: the enforcement has to live where the agent's reasoning cannot reach it.

What mediation looks like in the data path

Concretely, mediation means a proxy or control plane that every connection flows through. The agent connects to it the way it would connect to the database. The control plane parses each statement before it reaches the engine and classifies it by risk.

Classification is the core of it. A read is not a write. A single-row UPDATE with a primary-key predicate is not a TRUNCATE. An ALTER TABLE ... ADD COLUMN that is metadata-only differs from one that rewrites the table under a lock. Once each statement carries a risk label, you can attach a policy to the label rather than to the prompt:

  • Auto-allow safe reads. Investigation should not require a human in the loop. Let the agent run its SELECT queries freely so the useful work stays fast.
  • Gate risky writes and DDL behind approval. DELETE and UPDATE without a sufficiently selective predicate, schema changes, privilege grants, anything touching more rows than a threshold: hold it, surface the exact statement to a human, and only execute on approval. This is the human in the loop for database migrations pattern, applied per statement instead of per session.
  • Log everything, allowed or denied, into an append-only record. Who or what proposed the statement, the classification, the policy decision, the approver, and the result. When the change is irreversible, an immutable record is the only thing that lets you reconstruct what happened afterward. We go deeper on that in audit trails for AI coding agents.

Because all of this happens in the data path, prompt injection does not help the attacker. The malicious instruction can convince the agent to emit a DROP TABLE. It cannot convince the proxy to skip its classifier, because the proxy never reads the agent's context. The policy is enforced on the statement, not on the intent behind it.

Answering "just use a read replica"

A frequent counter is to point the agent at a read replica. It is a fine idea for the read half of the work, and you should do it for investigation. But a replica cannot apply the fix. Replicas are read-only by construction; the whole point of the fix is to change the primary. So a replica gives you the same outcome read-only credentials do: the diagnosis is automated and the remediation is still a manual copy-paste back to production, ungoverned. The replica protects the data the agent reads. It does nothing for the writes the agent actually needs to make. Mediation is what lets the write happen and stay governed.

Doing this across Postgres, MySQL, and MongoDB

None of this is Postgres-specific. The risky-operation taxonomy maps cleanly across engines. In Postgres and MySQL the dangerous categories are unbounded DELETE and UPDATE, locking DDL, TRUNCATE, and grants. In MongoDB the shapes differ (an updateMany or deleteMany with a loose filter, a dropCollection, an unindexed write touching a large set) but the risk axis is identical: how much state changes, how reversibly, and under what load.

What makes a control plane engine-agnostic is that it classifies operations by effect, not by dialect. A statement that mutates an unbounded set of documents in MongoDB belongs in the same policy bucket as an unbounded UPDATE in Postgres. Build the mediation layer once, in the data path, and the same approval-and-audit model covers every engine your agents touch. That is the design behind safe AI database access, and how it compares to instruction-only or replica-only approaches.

Where Datapace fits

Datapace is the guardrail and audit layer that sits in the data path between your AI agents and your databases, across Postgres, MySQL, MongoDB, and more. Every statement is classified by risk, safe reads flow through, risky writes and schema changes wait for human approval, and every action lands in an immutable audit log. Because the enforcement lives outside the agent's context, it holds even when the agent is prompt-injected or tries to route around an instruction. If you want agents that can actually fix your database without handing them ungoverned write access, see how this compares to instruction-only and replica-only approaches, or book a call and we will walk through it on your stack.

Sources

Frequently asked questions

Is read-only database access enough to make an AI agent safe?
Read-only prevents damage but also blocks the writes that make an agent useful, like applying an index, running a migration, or fixing corrupted rows. In practice teams grant write access anyway when a fix is needed, which removes the protection. A better model governs writes through a checkpoint in the data path rather than forbidding them outright.
Why do prompt-based guardrails fail to stop dangerous database operations?
Instructions in a system prompt live in the same context window as the data the agent reads, so injected content can override them. OWASP lists prompt injection as the top LLM risk because models cannot reliably separate trusted instructions from untrusted input. A capable agent can also route around a blocked tool by writing a script or wrapping the statement, so enforcement has to sit outside the agent's context.
Can I just point the AI agent at a read replica?
A read replica is a good choice for the investigation half of the work, since it protects the data the agent reads. But a replica is read-only by construction and cannot apply the fix to the primary, so remediation falls back to a manual, ungoverned copy-paste. Mediation in the data path is what lets the write happen while staying governed and audited.
Does statement-level mediation work across Postgres, MySQL, and MongoDB?
Yes, because the control plane classifies operations by their effect rather than by SQL dialect. An unbounded UPDATE in Postgres, a loose deleteMany in MongoDB, and a locking schema change all map to the same risk categories. A single mediation layer in the data path can apply the same approval and audit policy across every engine your agents touch.

Keep reading

Ready to let agents touch production, safely?

Bring a use case. We will show you what agents can do on your live data, inside your guardrails.