Essay
April 19, 2026
12 min read

From Eraser to PR-time: the robust automated fix problem

Eraser (VLDB 2023) named it. RIB and Hybrid Cost Modeling extended it. The robust-fix problem is one question asked at different layers of the DB stack. Datapace asks it at the PR surface.

Tags: PostgreSQL, Research, Learned optimizer, Database reliability, Automated tuning

The 2023 VLDB paper Eraser, by Weng, Zhu, Wu, Ding, Zheng, and Zhou, names a problem the learned-query-optimizer community had been circling for years without a clean formalization: how do you promote an automated database change without introducing a regression worse than what the change was supposed to fix. Eraser's answer, specific to learned optimizers, is a plugin that falls back to the baseline query optimizer when the learned plan would regress. The more interesting observation is that the same question applies to every learning-based DB component published since. Three more papers in 2024 and 2025 tackle variants of it. Datapace operates on the same problem, one layer earlier.

TL;DR. The robust-fix problem is: a learned or automated system proposes a database change; some fraction of its proposals are regressions; how do you filter the regressions before they ship. Eraser (VLDB 2023) does this at query-plan time via fallback to the baseline optimizer. Hybrid Cost Modeling (TKDE 2025) and RIB (PACMMOD 2025) do it for index tuning with noise-aware cost estimates. Uncertainty-quantification work extends it with explicit uncertainty bounds. Datapace's PR-time simulation applies the same logic at the migration surface, before the fix reaches production at all.

The problem, stated precisely

A system produces candidate database changes: a new plan chosen by a learned optimizer, an index recommended by a tuning advisor, a knob setting chosen by a tuning loop, a migration drafted by a coding agent. Each candidate has a predicted benefit (faster queries, lower cost) and a possible regression (slower queries, broken workload). The benefit and the regression are both probabilistic: the same candidate, applied to slightly different production state, may land on either side of the line.

The robust-fix problem is to build a promotion gate that keeps the benefits and rejects the regressions. The gate has to be calibrated against the cost asymmetry: a 10-percent speedup delivered 80 percent of the time, with a 2x regression the other 20 percent, is a losing trade even though the proposal wins four times out of five, because the 2x regression is what people notice and remember.
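The asymmetry is easy to check with arithmetic. Treating latency as a multiplier (0.9x on a win, 2.0x on a regression), the expected outcome of shipping without a gate:

```python
# Expected latency multiplier for an ungated rollout:
# 80% chance of a 10% speedup (0.9x latency),
# 20% chance of a 2x regression (2.0x latency).
p_win, win_mult = 0.80, 0.90
p_reg, reg_mult = 0.20, 2.00

expected = p_win * win_mult + p_reg * reg_mult
print(f"expected latency multiplier: {expected:.2f}")  # 1.12
```

Even on pure expectation, latency rises 12 percent: the tail does not merely dominate perception, it dominates the mean.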

Timeline of the robust-fix research lineage, 2023 to 2026:

- Eraser (VLDB 2023): plugin on top of learned query optimizers that falls back to the baseline when the learned plan would regress.
- Robust Query Processing (Haritsa, SIGMOD Blog essay): geometric framing for robustness in query processing.
- Hybrid Cost Modeling (TKDE 2025; Wu, Wang, Narasayya, Chaudhuri): blends real cost data with model estimates to reduce regression in index tuning under limited training data.
- RIB (PACMMOD 2025): probabilistic quantile regression plus a GNN encoder for noise-robust index benefit estimation.
- Uncertainty Quantification (arXiv 2024; Yu et al.): explicit uncertainty bounds as a guardrail on learning-based tuning.
- Datapace PR-time simulation (2026): the same problem one layer earlier, at the migration surface, before the fix reaches production at all.

Each paper narrows the same question at a different layer of the stack.

Eraser: the plugin model

Eraser targets learned query optimizers specifically. Systems like Lero, HyperQo, and PerfGuard choose query plans using learned cost estimates; sometimes the learned plan beats Postgres's default optimizer, sometimes it regresses. Eraser sits on top and decides, for each incoming query, whether to use the learned plan or fall back to Postgres. The fallback is the safety net. The paper reports that Eraser "eliminates performance regressions while still attaining considerable overall performance improvement," which is the closest thing to a statement of the robust-fix problem in the database literature.

Two architectural choices matter for the generalization. First, the baseline is always available as a comparator. Eraser does not decide "is this plan good in absolute terms"; it decides "is this plan at least as good as what Postgres would have done." Second, the decision is per-query, not global. A learned optimizer can be net-positive on the workload while regressing on some fraction of queries, and Eraser carves out the regressions without losing the gains.

The same structure shows up in every subsequent paper in this lineage. A learning-based component proposes; a guardrail compares against a safe baseline; the promotion gate lets the proposal through only when the comparison clears.
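The per-query gate can be sketched in a few lines. This is an illustrative reduction, not Eraser's actual regression predictor, which involves more than a cost comparison; the cost values here are invented:

```python
def choose_plan(learned_cost, baseline_cost, margin=1.0):
    """Use the learned plan only when its estimated cost does not exceed
    the baseline's (scaled by `margin`); otherwise fall back. The baseline
    optimizer is always available as the safety net."""
    return "learned" if learned_cost <= baseline_cost * margin else "baseline"

# Per-query decisions: the learned optimizer wins on q1, regresses on q2,
# and the gate carves out the regression without losing the gain.
workload = {"q1": (80.0, 100.0), "q2": (250.0, 100.0)}  # (learned, baseline)
decisions = {q: choose_plan(lc, bc) for q, (lc, bc) in workload.items()}
print(decisions)  # {'q1': 'learned', 'q2': 'baseline'}
```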

Hybrid Cost Modeling: noise-aware promotion for index tuning

The 2025 TKDE paper by Wu, Wang, Narasayya, and Chaudhuri (Microsoft Research) applies the same problem shape to autonomous index tuning. The setup: cloud index tuners rely on the query optimizer's cost estimates to pick indexes; those estimates can be wrong; and wrong estimates produce regression-inducing recommendations. Existing work trains ML models to improve estimates, but "training data collection is typically an expensive process, especially for index tuning due to the significant overhead of creating/dropping indexes."

The hybrid-cost-modeling contribution: blend the optimizer's analytical cost estimate with a learned estimate, weighted by the amount of training data available for the query in question. When training data is plentiful, lean on the learned estimate. When training data is scarce, fall back to the analytical one. The blending is the promotion gate, calibrated against the noise in the signal.
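A minimal sketch of the blending idea, with an invented weighting formula standing in for the paper's data-dependent scheme:

```python
def hybrid_cost(analytical, learned, n_samples, k=10.0):
    """Blend the optimizer's analytical cost estimate with a learned one,
    weighting the learned estimate by how much training data backs it.
    `k` controls how quickly trust shifts to the learned model; this
    formula is an illustrative stand-in, not the paper's exact weighting."""
    w = n_samples / (n_samples + k)  # 0 with no data, approaches 1 with lots
    return (1 - w) * analytical + w * learned

# Scarce data: stay close to the analytical estimate.
print(hybrid_cost(analytical=100.0, learned=40.0, n_samples=1))   # ~94.5
# Plentiful data: lean on the learned estimate.
print(hybrid_cost(analytical=100.0, learned=40.0, n_samples=990)) # ~40.6
```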

The structural parallel to Eraser is direct. A learned component proposes (the learned cost estimate). A safe baseline is available (the optimizer's analytical estimate). The gate decides how much weight to give each based on how much confidence is warranted. What differs is the axis: Eraser guards against plan regressions, Hybrid Cost Modeling guards against advisor regressions.

RIB: explicit distributions instead of point estimates

The 2025 PACMMOD paper on RIB (Robust Learning-based Index Benefit Estimation) goes a step further. Rather than improving a single cost estimate, it replaces point estimates with full distributions. The architecture has two novel pieces: a bidirectional GNN encoder that captures structural changes in query plans before and after index use, and a fully parameterized quantile regression predictor that produces an entire distribution of likely benefits instead of a scalar.

The paper names two noise sources the distribution handles directly. Epistemic noise comes from the model's limited view of the plan; the GNN encoder attacks that. Aleatoric noise comes from the inherent variability of production execution; the quantile regression attacks that. The promotion gate now has a confidence interval to work with, not a point estimate, and can reject proposals whose lower-confidence tail includes a regression.
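A toy version of that distributional gate, with hand-written quantiles standing in for the output of a RIB-style quantile-regression model:

```python
def gate_on_quantiles(benefit_quantiles, risk_quantile=0.05):
    """Accept an index proposal only when the low tail of its predicted
    benefit distribution is still a gain. `benefit_quantiles` maps quantile
    levels to predicted benefit (positive = faster); a real model would
    produce these, and the values below are invented."""
    return benefit_quantiles[risk_quantile] > 0.0

# Both proposals have a similar median benefit; a point estimate would
# accept both, but the gate rejects the one with a regressive tail.
safe  = {0.05: 0.04, 0.50: 0.30, 0.95: 0.55}
risky = {0.05: -0.20, 0.50: 0.35, 0.95: 0.80}
print(gate_on_quantiles(safe), gate_on_quantiles(risky))  # True False
```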

This is structurally the same move that happened in ML classification decades ago when calibrated probabilities replaced hard predictions. In the database tuning context, it is new: most cost models still emit scalars, and most advisors still recommend without an explicit confidence. RIB's contribution is showing that the distributional framing actually reduces regression rates on realistic benchmarks.

Robust Query Processing: the conceptual backbone

Jayant Haritsa's long-running work on Robust Query Processing (RQP), summarized in a SIGMOD Blog essay and elaborated across a decade of papers, provides the theoretical framing for why the above problems are hard and how to think about them. The central idea is geometric: a query plan chosen at compile time is optimal only if the cardinality estimates it was chosen against turn out to be correct. The space of realized costs as a function of actual cardinalities is a surface, and robust plan selection means finding a plan that is not necessarily the minimum at the estimated point but stays close to the minimum across a neighborhood of possible points.

The framing carries past query plans. Any learning-based database component is, in effect, choosing a configuration whose optimality is a function of inputs that may not match estimates. The robust framing asks: among the candidate configurations, which one has the best worst-case behavior across a reasonable neighborhood of input conditions. That is exactly the question the promotion-gate layer in Eraser, Hybrid Cost Modeling, and RIB are each answering with their specific mechanism.
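The geometric framing reduces to a min-max choice over a neighborhood. A toy example with two invented cost curves, one that wins at the estimate and one that wins in the worst case:

```python
# Each plan's cost is a function of actual cardinality (curves invented).
plans = {
    "nested_loop": lambda card: 0.001 * card * card,  # great small, awful large
    "hash_join":   lambda card: 50 + 0.5 * card,      # flat across the range
}
estimate = 100                       # estimated cardinality
neighborhood = range(50, 2001, 50)   # plausible actual cardinalities

# Classical choice: cheapest at the estimated point.
at_estimate = min(plans, key=lambda p: plans[p](estimate))
# Robust choice: best worst-case cost across the neighborhood.
robust = min(plans, key=lambda p: max(plans[p](c) for c in neighborhood))
print(at_estimate, robust)  # nested_loop hash_join
```

If the estimate is right, the robust plan gives up a little; if the estimate is badly wrong, it avoids the quadratic blowup. That is the trade every promotion gate in this lineage is making.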

Uncertainty quantification as an orthogonal layer

The 2024 arXiv work by Yu et al., "Can Uncertainty Quantification Enable Better Learning-based Index Tuning," is an explicit proposal to bolt uncertainty bounds onto learning-based index tuners so the recommendations carry a confidence that the tuner can act on. If the learned benefit estimate for adding index idx_A is 40 percent faster with a 95-percent confidence interval that straddles zero, the tuner should not recommend it. If the interval is tight and above zero, the tuner should recommend with confidence.

This is close in spirit to RIB and complementary to Hybrid Cost Modeling. What makes it orthogonal is that it focuses on the calibration layer rather than the estimator itself. Any learned component that emits scalar outputs can be wrapped with an uncertainty quantification layer; the wrapper makes the underlying component more usable in a promotion gate regardless of how the component was trained.
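One way such a wrapper can work, sketched with an ensemble of scalar estimates and a normal-approximation interval; the paper surveys UQ techniques well beyond this, and the sample values are invented:

```python
import statistics

def ci_gate(samples, z=1.96):
    """Wrap an ensemble of scalar benefit estimates (e.g. from bootstrap
    resampling or a model ensemble) with a ~95% confidence interval and
    recommend only when the interval clears zero."""
    mean = statistics.mean(samples)
    sem = statistics.stdev(samples) / len(samples) ** 0.5
    lo, hi = mean - z * sem, mean + z * sem
    return lo > 0.0, (lo, hi)

tight = [0.38, 0.41, 0.40, 0.42, 0.39]    # tight interval above zero: recommend
noisy = [0.90, -0.60, 0.75, -0.45, 0.55]  # interval straddles zero: hold back
print(ci_gate(tight)[0], ci_gate(noisy)[0])  # True False
```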

What Datapace does, in this framing

The four papers above operate inside the database: optimizer-plan selection, index recommendations, cost estimates, tuning-loop advice. Datapace operates one layer earlier: on the migration or schema change the developer is about to merge, before any of the database-internal machinery gets involved. The robust-fix problem looks the same, stated in a different vocabulary.

A developer drafts a migration, optionally with AI assistance. The proposal is the migration itself. The predicted benefit is the schema change the team wanted. The possible regression is the lock cascade, the long table rewrite, the unintended plan shift. The baseline to compare against is "what happens if the migration does not ship today." Datapace reads production state (pg_locks, pg_stat_activity, table sizes, current plans), estimates the cost of the proposal against that state, and blocks the merge when the estimated cost exceeds the team's threshold.
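The shape of such a gate, reduced to one signal. Every name and the cost model here are hypothetical, invented for illustration; Datapace's actual estimator consumes far richer production state than a single rewrite-time estimate:

```python
def gate_migration(ddl_needs_rewrite, table_rows, avg_row_bytes,
                   rewrite_bytes_per_sec, max_lock_seconds):
    """Block the merge when the estimated table-rewrite time (and hence
    the lock window) exceeds the team's threshold; otherwise approve.
    Hypothetical sketch: real DDL cost has many more inputs than this."""
    if not ddl_needs_rewrite:
        return "approve"
    est_seconds = table_rows * avg_row_bytes / rewrite_bytes_per_sec
    return "block" if est_seconds > max_lock_seconds else "approve"

# A type change rewriting a 500M-row table blows past a 60-second budget.
print(gate_migration(True, 500_000_000, 120, 200_000_000, 60))   # block
print(gate_migration(False, 500_000_000, 120, 200_000_000, 60))  # approve
```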

The mechanism is different from Eraser, Hybrid Cost Modeling, and RIB. The problem shape is the same: a learning- or automation-produced proposal, a safe baseline, and a promotion gate that compares them. Our proposal and our baseline happen at the PR surface, not inside the optimizer. The research lineage says the gate exists. We move the gate.

Eraser, RIB, Hybrid Cost Modeling

- Layer: inside the database
- Proposal: plan, index, cost estimate
- Baseline: analytical optimizer, prior behavior
- Gate: at execute or recommend time
- Failure mode addressed: regressive learned plan or advisor pick

Datapace PR-time simulation

- Layer: pre-merge, repository surface
- Proposal: migration, DDL, query change
- Baseline: "do not merge today," current production state
- Gate: at PR-comment time, before merge
- Failure mode addressed: cascade, rewrite, plan regression caused by DDL

Closing note

The Eraser paper and its successors named a problem that was already being solved ad-hoc by every production team that used a learned optimizer in anger: the learned output is useful on average and dangerous in the tail, and the tail has to be filtered before the output reaches production. The research community has now produced four clean formulations of that filtering problem at different layers of the DB stack. Datapace is the fifth layer: the PR surface, before any of the four become relevant. The lineage is how we know the problem shape is real, and how we know the filter has to be probabilistic rather than deterministic. The contribution we add is the surface and the signal. Everything else is an extension of what the database research community has been building since 2023.

If you want this kind of PR-time verdict for your Postgres migrations, that is what we are building at Datapace.

Frequently asked questions

Is Datapace just Eraser for migrations?

Structurally similar. Eraser and Datapace both solve the robust-fix problem: accept a learning-based or automated proposal, compare against a safe baseline, filter the regressive cases. The difference is the layer. Eraser lives inside the optimizer; Datapace lives at the PR surface. The signals we consume are different, the baseline we compare against is different, and the action we take is a PR comment rather than a query-plan choice. The problem shape is the same.

Are the learned-optimizer papers irrelevant to Postgres teams that do not use learned optimizers?

The specific systems Eraser targets (Lero, HyperQo, PerfGuard) are research prototypes. The problem shape they formalize applies to any automation that proposes database changes. A team using an LLM to draft migrations, an index advisor to recommend indexes, or a tuning script to choose knobs is running the same robust-fix problem under a different label. The mechanisms in the papers are specific; the argument shape transfers.

Can RIB-style uncertainty quantification be applied to migration proposals?

Yes, and it is the natural next step. A Datapace verdict that currently reports "block, estimated queue indefinite" could instead report "block, 95-percent confidence interval on lock duration is 4 to 40 minutes." That level of precision requires calibrated uncertainty estimates, which in turn require enough historical data from the team's own production to calibrate against. It is a calibration layer on top of the basic gate.

What is the weakest link in this lineage as of 2026?

Training data. Every paper in this list carries a caveat about data scarcity, from Hybrid Cost Modeling's "training data collection is expensive" to Hit the Gym's 93-percent-of-time-on-collection result. The 2017 self-driving DBMS vision was optimistic about how cheap behavior models would be to train, and the 2024 to 2025 papers are still working through the consequences of that being wrong. Covered in more depth in the earlier post on self-driving databases.

Is the Haritsa RQP line still active?

Yes, as a conceptual backbone. The specific techniques (plan diagrams, parametric query optimization, robust plan selection) are referenced by newer work as the theoretical basis for why robustness is hard and how to reason about it geometrically. The live research is downstream of the framing, not the framing itself.

Sources

  1. L. Weng, R. Zhu, D. Wu, B. Ding, B. Zheng, J. Zhou, "Eraser: Eliminating Performance Regression on Learned Query Optimizer", PVLDB 17(5), 2024 (VLDB 2023 accepted).
  2. W. Wu, X. Wang, V. R. Narasayya, S. Chaudhuri, "Hybrid Cost Modeling for Reducing Query Performance Regression in Index Tuning", IEEE TKDE 37(1), 2025.
  3. "RIB: Robust Learning-based Index Benefit Estimation", PACMMOD 2025.
  4. J. R. Haritsa, "Robust Query Processing: A Case for Geometric Techniques", SIGMOD Blog and related papers.
  5. H. Yu et al., "Can Uncertainty Quantification Enable Better Learning-based Index Tuning", arXiv 2024.
  6. W. S. Lim, L. Ma, W. Zhang, M. Butrovich, S. I. Arch, A. Pavlo, "Hit the Gym", PVLDB 17(11), 2024.
  7. Datapace blog, "Self-driving databases in 2026: what actually shipped".
  8. Datapace blog, "The 8,400x staging gap: why staging lies about migration safety".

