Finding the queries that actually hurt with pg_stat_statements
pg_stat_statements is the only unbiased view of which queries Postgres actually spends time on. Two queries cover most use. Here is setup, the two queries, and how to read the output.
If you are debugging Postgres performance by running EXPLAIN ANALYZE on queries your team suspects are slow, you are diagnosing the wrong set. The right set is whatever pg_stat_statements tells you is expensive in aggregate across all the queries your application actually runs. That set does not overlap cleanly with engineering intuition, and it is the only unbiased source of truth for what Postgres is actually spending time on.
What pg_stat_statements does
pg_stat_statements is a standard extension, distributed with Postgres, that maintains a running tally for every normalized query executed on the server. Normalization strips literals: SELECT * FROM users WHERE id = 123 and SELECT * FROM users WHERE id = 456 are the same statement, counted together. For each distinct statement, the extension records:
calls: total number of executions since last reset.total_exec_time: cumulative execution time across all calls.mean_exec_time,min_exec_time,max_exec_time,stddev_exec_time: distribution of execution time.rows: total rows returned across all calls.shared_blks_hit,shared_blks_read: buffer cache hits vs. disk reads.
From those columns, almost every interesting performance question has an answer. The shape of the question determines which column to sort on.
Setup
On Supabase
Enabled by default on every project since early 2024. Nothing to install, nothing to configure. Verify with SELECT * FROM pg_available_extensions WHERE name = 'pg_stat_statements';. The extension lives in the extensions schema; queries below reference it as pg_stat_statements because it is on the default search path.
On self-hosted Postgres
Two steps. First, add the library to shared_preload_libraries in postgresql.conf:
shared_preload_libraries = 'pg_stat_statements'
Restart Postgres for the change to take effect. Then create the extension in the database you want to measure:
CREATE EXTENSION pg_stat_statements;
The extension is per-database. If you have multiple databases, create it in each.
On other managed providers
RDS, Cloud SQL, Neon, Azure Database for Postgres: all support pg_stat_statements and most enable it by default or behind a one-click toggle in the console. Check the provider's parameter group for shared_preload_libraries.
The two queries you will actually run
There are many ways to slice pg_stat_statements. Two cover 90 percent of real-world use. Keep them bookmarked.
Total Pain
total_exec_timeFrustration Finder
mean_exec_timeTotal Pain, the query
SELECT
substring(query, 1, 100) AS query,
calls,
round(total_exec_time::numeric, 1) AS total_ms,
round(mean_exec_time::numeric, 2) AS mean_ms,
rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 20;
The top row is almost always one of: an analytics-style aggregate run by a dashboard, a background job that churns through a large set, or a hot read path like a user session lookup. The top row does not tell you what is wrong. It tells you where your database spends its time. The next question is why.
Frustration Finder, the query
SELECT
substring(query, 1, 100) AS query,
calls,
round(mean_exec_time::numeric, 2) AS mean_ms,
round(stddev_exec_time::numeric, 2) AS stddev_ms,
rows / GREATEST(calls, 1) AS mean_rows
FROM pg_stat_statements
WHERE calls > 100
ORDER BY mean_exec_time DESC
LIMIT 20;
Filtering on calls > 100 removes one-off queries that ran once during a migration and took a second. The useful result is a query that runs many times and still averages seconds per call. That is the signature of a missing index, a bad plan, or a query waiting on locks it shouldn't need.
Reading the output
A real pg_stat_statements result looks like this (truncated):
query | calls | total_ms | mean_ms | rows
------------------------------------+---------+-----------+---------+--------
SELECT ... FROM orders JOIN cust.. | 184,382 | 1,284,110 | 6.96 | 1.0
SELECT * FROM pg_stat_statements | 218 | 172,040 | 789.17 | 215.0
UPDATE sessions SET last_seen_at.. | 92,110 | 112,220 | 1.22 | 1.0
SELECT * FROM products WHERE sto.. | 1,204 | 98,400 | 81.72 | 502.0
SELECT name FROM users WHERE id.. |2,147,883 | 66,040 | 0.03 | 1.0
Four patterns to read from a table like this.
Row 1 is the biggest total cost, a join query running hundreds of thousands of times a day. The mean is 7 ms, which is fine per call. The fix, if one is needed, is to reduce how often the endpoint that calls it runs, or to cache.
Row 2 is the extension itself. Your diagnostic queries show up in their own output. Ignore.
Row 4 is an 82 ms query running 1,204 times. The total is small but the mean is bad. This is where EXPLAIN ANALYZE earns its keep. An 82 ms query on 500 rows usually means a missing index or a filter that defeats the indexes that exist.
Row 5 is the N+1 signature. Two million calls, 0.03 ms each. Individually invisible. Collectively, the fourth-largest total time. The fix is a join in the calling code, not an index.
Resetting counters
pg_stat_statements_reset() zeroes all the counts. Useful as a before-and-after around a deploy: reset, let an hour of production traffic run, snapshot, and you have exactly what the new version of the application does. Requires pg_read_all_stats or superuser on Postgres 14 or later.
SELECT pg_stat_statements_reset();
The attribution problem
pg_stat_statements tells you which queries are expensive. It does not tell you which deploy introduced the expense. A slow query that was fine last week and pages you today is only recognizable as a regression if you have a baseline from before the change. Without a before-and-after snapshot tied to a Git SHA, you are guessing.
Two approaches to solve this. One is manual: reset the extension at deploy time, capture a snapshot one hour later, and diff. This works for a small team with a slow deploy cadence. The other is to run an agent that watches the counters continuously, takes snapshots at deploy boundaries it detects from the database itself, and attributes changes to the surrounding commit window. That is the category a tool like Datapace fits into, and the specific thing it does today is surface call-count and total-time deltas with a commit range, not interpret the query's purpose. The query is still yours to read.
Whichever approach you use, the important part is that the baseline exists. Without one, pg_stat_statements is a snapshot of the present, not a comparison across time.
Closing note
The argument for pg_stat_statements is not that it is a sophisticated tool. It is not. It is a set of counters, incremented every time a query runs, totaled and averaged when you ask. The reason it matters is that it is the only view of your database that is not biased by which query someone decided to inspect. Start there. Pick the top row. Then run EXPLAIN ANALYZE BUFFERS on it.
Frequently asked questions
How much overhead does pg_stat_statements add?
Small. The extension increments counters on every query execution; published benchmarks from the Postgres developers put the overhead under 2 percent on most workloads, and it scales with query count rather than query complexity. Leave it on in production.
Does it capture queries that fail?
By default, no. Only successful executions are counted. Set pg_stat_statements.track = 'all' to include nested statements (queries executed inside functions), but failed queries are still excluded.
Why do I see truncated queries?
The default track_activity_query_size is 1024 bytes. Long queries are truncated at that length before hashing, which means long queries with different ORDER BY clauses after the 1024-byte mark may hash to the same entry. Raise the setting (requires restart) if you have a lot of long queries.
What is pg_stat_statements.max and why does it matter?
The maximum number of distinct query texts the extension retains. Default is 5000. When the table fills, least-used entries are evicted. On a large application with dynamically generated SQL, raise this to 10,000 or more. Each entry costs a fixed amount of shared memory; the trade-off is small.
Sources
- PostgreSQL documentation, pg_stat_statements
- Supabase documentation, pg_stat_statements
Want to optimize your database performance?
Get AI-powered recommendations for your specific database setup.