Reading Postgres EXPLAIN ANALYZE BUFFERS, end to end
Postgres has three concentric levels of EXPLAIN: the plan, the actual execution, and the real I/O cost. Most developers learn the first and miss the other two. Here is all three, on real output.
Postgres gives you three levels of information about a query. They are concentric: each one adds to the previous without replacing it. Most developers learn the first level, use it for everything, and never turn on the other two. That is a mistake. EXPLAIN tells you what Postgres plans to do. EXPLAIN ANALYZE tells you what happened when it actually ran. EXPLAIN ANALYZE BUFFERS tells you how much I/O the run cost. Each level answers a different kind of question, and the gap between them is where most performance bugs live. This post walks the three levels end to end, with real output, on a real schema.
Each ring contains the previous one. BUFFERS does not replace ANALYZE. ANALYZE does not replace EXPLAIN.
Level 1: EXPLAIN, the cheap peek
Plain EXPLAIN does not execute the query. It asks the planner to choose a plan and prints the tree, along with the planner's row estimates and cost estimates. The cost is in an arbitrary unit calibrated to sequential-page I/O and is not directly comparable to milliseconds.
Use EXPLAIN when you want to know whether a query will hit an index, whether the join order is sane, or whether the planner is about to do something obviously wrong on a table you cannot execute against in production. It is fast, safe, and free of side effects.
EXPLAIN SELECT * FROM orders WHERE customer_id = 1234;
QUERY PLAN
-------------------------------------------------------------
Index Scan using orders_customer_idx on orders
Index Cond: (customer_id = 1234)
(cost=0.43..12.52 rows=4 width=132)
Two things to read off this output. The plan node is an Index Scan, which means Postgres will use orders_customer_idx and should be fast. The estimate is four rows. If you expected four thousand rows, the statistics on this table are stale and someone needs to run ANALYZE. Bad estimates cause bad plans, especially for joins.
Level 2: EXPLAIN ANALYZE, the actual run
Adding ANALYZE runs the query for real and reports actual row counts and actual time per plan node, alongside the estimates. It is the right tool when a query is slow and you want to know why.
Two important consequences of running it: the query has side effects if it is a write, and it takes as long as the query takes. Wrap writes in a transaction and roll back. For slow queries, be patient.
EXPLAIN ANALYZE SELECT * FROM orders WHERE total > 500;
QUERY PLAN
----------------------------------------------------------------------
Seq Scan on orders
Filter: (total > 500)
Rows Removed by Filter: 99,400
(cost=0.00..2,451.00 rows=600 width=132)
(actual time=0.012..18.450 rows=600 loops=1)
Planning Time: 0.118 ms
Execution Time: 18.502 ms
This is a Seq Scan. Postgres read every row in orders, kept the 600 that matched, discarded the 99,400 that did not. On a small table that is fine. On a table with fifty million rows, it is not. The fix is an index on total. After creating one, the same query returns:
EXPLAIN ANALYZE SELECT * FROM orders WHERE total > 500;
QUERY PLAN
----------------------------------------------------------------------
Index Scan using orders_total_idx on orders
Index Cond: (total > 500)
(cost=0.29..86.40 rows=600 width=132)
(actual time=0.047..0.612 rows=600 loops=1)
Planning Time: 0.091 ms
Execution Time: 0.698 ms
The plan node changed from Seq Scan to Index Scan. Execution time dropped from 18 ms to 0.7 ms. Same data, same result, 26x faster because Postgres now has a cheap way to find the matching rows.
Three signals to read for every EXPLAIN ANALYZE:
- Estimated rows vs. actual rows. If they are off by more than ten, the planner is flying blind and a better estimate from
ANALYZE <table>(the maintenance command, not the EXPLAIN option) will change the plan. - The time spent in each node, bottom-up. A long time in a specific node tells you where the cost actually is.
- Unusual nodes.
Materialize,Hash,Sort, andNested Loopon large inputs are worth a second look.
Level 3: EXPLAIN ANALYZE BUFFERS, the real cost
Execution time is a symptom. The cause of slow execution is usually I/O. Postgres stores tables and indexes in 8 KB pages called buffers. A buffer hit means Postgres found the page in its shared memory cache. A buffer read means Postgres had to go to disk. The difference between those two is multiple orders of magnitude on any real system.
Turn BUFFERS on with EXPLAIN (ANALYZE, BUFFERS) and Postgres adds a line to each plan node telling you exactly how many buffers were hit and how many were read from disk.
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM orders WHERE total > 500;
QUERY PLAN
----------------------------------------------------------------------
Index Scan using orders_total_idx on orders
Index Cond: (total > 500)
(cost=0.29..86.40 rows=600 width=132)
(actual time=0.047..0.612 rows=600 loops=1)
Buffers: shared hit=72
Planning Time: 0.091 ms
Execution Time: 0.698 ms
Seventy-two buffers hit, zero reads. The whole query worked from cache. That is why it was fast.
Now look at a cold version of the same query, run right after a restart before the cache is warm:
Index Scan using orders_total_idx on orders
...
Buffers: shared hit=4 read=68
...
Execution Time: 8.200 ms
Same plan, same rows, same mean cost estimate. The real-world execution went from 0.7 ms to 8.2 ms purely because the data had to come from disk instead of cache. BUFFERS is the only way to see this difference. Without it, two runs of the same query look unexplainedly different.
Read BUFFERS output like this:
shared hit: pages that were in memory. Cheap.shared read: pages that came from disk. Expensive.shared dirtied: pages modified by the query, waiting to be written back.shared written: pages actually flushed to disk during the query.
A query that reports shared read=40,000 is reading 320 MB from disk. That is the query that actually costs you on a cloud database, because cloud storage I/O is the fee that scales, not CPU. A query that reports shared hit=40,000 read=0 touched the same volume of data but paid for it entirely in memory bandwidth.
When this becomes a workflow problem
EXPLAIN ANALYZE BUFFERS is the right tool for one query. It is the wrong tool for the hundred queries your application runs per request. For that, the signal moves from "analyze this query" to "which of the thousands of queries in flight deserves to be analyzed in the first place."
That is a different question, with a different answer: pg_stat_statements. pg_stat_statements tells you which queries are expensive in aggregate, which are called most often, and which have grown since a deploy. Only once you know which query to look at does EXPLAIN ANALYZE BUFFERS become useful. Running EXPLAIN on every query your app emits is not a workflow. Picking the right one and running EXPLAIN ANALYZE BUFFERS on that one is.
The honest framing: EXPLAIN is for diagnosis. pg_stat_statements is for triage. Do the triage first, then the diagnosis.
Notes on the output format
Postgres supports FORMAT TEXT, FORMAT JSON, FORMAT YAML, and FORMAT XML. Text is the default and the most compact. JSON is the right format when feeding the output into a tool. A query like:
EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON) SELECT ...;
produces a structured document that can be diffed, stored, or visualized. The website explain.depesz.com takes any text-format output and turns it into a highlighted tree that makes node timing easy to spot. For team review of a hot query plan, paste the JSON there and share the link.
Closing note
Most performance problems come from a wrong plan on one query. EXPLAIN shows you the plan. EXPLAIN ANALYZE shows you whether the plan behaved. EXPLAIN ANALYZE BUFFERS shows you the real cost that behavior paid. If a query is slow, the first command to run is EXPLAIN (ANALYZE, BUFFERS), not EXPLAIN. The extra information is free once you are already paying the execution cost of the query.
Frequently asked questions
Does EXPLAIN run my query?
Plain EXPLAIN does not. It asks the planner to produce a plan and returns the plan tree with estimates. EXPLAIN ANALYZE does run the query. If the query is a write (INSERT, UPDATE, DELETE), wrap it in a BEGIN; ... ROLLBACK; block to avoid committing the side effects.
What is a reasonable planner-estimate accuracy?
Within a factor of 2x is fine. Within 10x is suspect. Beyond 10x, the planner will often choose a wrong join shape. When estimates are badly off, run ANALYZE <tablename> (the maintenance command, not the EXPLAIN option). If estimates stay bad after, pg_stats for the relevant columns will tell you whether the distribution is too skewed for the default statistics target, and raising default_statistics_target may help.
Why do I get different timing each time I run EXPLAIN ANALYZE?
The buffer cache. The first run reads pages from disk, which is slow. Subsequent runs find the pages already in memory, which is fast. Use EXPLAIN (ANALYZE, BUFFERS) to see this directly in the shared hit versus shared read numbers. To force a cold run on the one query you care about, restart Postgres (not recommended on a live system) or use pg_prewarm to warm a specific relation deliberately.
Is there a way to get EXPLAIN output for a query that is already running?
Not directly, but auto_explain can log the plan of any query that exceeds a time threshold. Add auto_explain to shared_preload_libraries, set auto_explain.log_min_duration = 1000 to log plans of queries over one second, and optionally auto_explain.log_analyze = true and auto_explain.log_buffers = true to get the full output in the server log. The cost is the usual cost of ANALYZE, which runs the query for real.
Sources
- PostgreSQL documentation, EXPLAIN
- PostgreSQL documentation, Using EXPLAIN
- PostgreSQL documentation, auto_explain
- explain.depesz.com, plan visualizer
Want to optimize your database performance?
Get AI-powered recommendations for your specific database setup.