40 copy-paste SQL queries covering every PostgreSQL bottleneck — plus a 0–100 health score, a shareable HTML report, and optional AI analysis. Run them in psql, the CLI, or Docker. No agents. No install. Just SQL.
PostgreSQL keeps extraordinarily detailed telemetry in its system catalog. These nine signals predict 95% of production incidents — before they become incidents.
pg_stat_statementsCREATE CONCURRENTLY — are invisible to the planner and invisible to the eye.pg_stat_user_indexes, pg_indexpg_stat_user_tables, pg_classmax_connections every new connection attempt fails with a hard error. Unlike most resources, connection capacity is binary: at 99% you're fine; at 101% the application is down. Sessions stuck in idle in transaction silently hold locks and eat capacity without doing any work.pg_stat_activityALTER TABLE, a long transaction, a forgotten cursor — can block every subsequent write to that table. PostgreSQL queues them up silently. In minutes you can go from one stuck session to 50 waiting sessions and a fully unresponsive application.pg_locks, pg_blocking_pids()pg_stat_replication, pg_replication_slotsage(datfrozenxid) in pg_databaseshared_buffers cache is the fastest path to data. A cache hit costs microseconds; a cache miss hits disk and costs milliseconds. A healthy OLTP database should be above 95–99%. A sustained drop signals that your working set has grown past your buffer pool, or a runaway query is evicting hot pages.pg_stat_database blks_hit / blks_readbuffers_backend) bypass the checkpoint process entirely and indicate a severely undersized buffer pool.pg_stat_checkpointer (PG17+), pg_stat_bgwriterChoose a common symptom below to view a step-by-step diagnostic path with copy-pasteable SQL. Powered by pgvitals diagnostics.
Response times jumped; users reporting slowness...
8 diagnostic areas, 40 sections, each with a clear threshold and action step so you always know what to do next.
Pick the workflow that fits — a production incident, a routine health check, a load test, a shareable report, or AI-assisted triage.
--explain for a prioritized, plain-English fix list from Claude.Capture a point-in-time snapshot of all metrics with one function call. Compare baseline vs peak. Find what changed.
schema.sql and capture_snapshot.sql to create the monitoring schema and function.capture_snapshot('baseline') before the test, 'peak' at max load, 'cooldown' after.trend_queries.sql to see connection growth, lock wait spikes, dead tuple accumulation, cache hit ratio trends.pgvitals includes a Python-based diagnostic runner that connects to any live database, runs all 40 sections, and generates a Markdown or self-contained HTML report — with an optional AI analysis that prioritizes the fixes.
pgvitals CLI with pip. Downloads the SQL library once with init, then run full diagnostics against any database — no config files needed for a quick check.pgvitals.conf → named connection profiles → environment variables (like PGPASSWORD) → command-line flags.| # | Section | Status | Findings |
|---|---|---|---|
| 01 | Slow Queries | Data | 5 rows returned |
| 06 | Unused Indexes | Data | 36 rows returned |
| 08 | Invalid Indexes | Clear | No issues |
| 15 | Autovacuum Activity | Error | Permission Denied |
rnc_reference_map.
40 sections. Every bottleneck covered. Copy, paste, act.