PostgreSQL Diagnostic Queries

Query Behavior

#01

Slow / Expensive Queries

WhatTop queries ranked by total CPU time consumed since last stats reset.

Look formean_exec_ms > 100 | pct_total_time > 10%

ActionEXPLAIN ANALYZE the top offenders; add indexes or rewrite query logic.

Requires: pg_stat_statements

SQL▾

SELECT
    round(total_exec_time::numeric, 2)                                        AS total_exec_ms,
    calls,
    round(mean_exec_time::numeric, 2)                                         AS mean_exec_ms,
    round(stddev_exec_time::numeric, 2)                                       AS stddev_exec_ms,
    round((100 * total_exec_time / sum(total_exec_time) OVER ())::numeric, 2) AS pct_total_time,
    rows,
    round(rows::numeric / nullif(calls, 0), 2)                               AS rows_per_call,
    round(shared_blks_hit::numeric
        / nullif(shared_blks_hit + shared_blks_read, 0) * 100, 2)            AS cache_hit_pct,
    left(query, 200)                                                           AS query_snippet
FROM pg_stat_statements
WHERE calls > 10
ORDER BY total_exec_time DESC
LIMIT 25;

Query Behavior

#02

Temp File & work_mem Pressure

WhatQueries spilling intermediate results to disk because work_mem is too small.

Look forAny temp_written_mb > 0 — every MB is a disk write

ActionIncrease work_mem for the session; tune join/sort strategy.

Requires: pg_stat_statements

SQL▾

SELECT
    calls,
    round(mean_exec_time::numeric, 2)                                AS mean_exec_ms,
    temp_blks_written,
    round((temp_blks_written * 8192.0 / 1024 / 1024)::numeric, 2)  AS temp_written_mb,
    temp_blks_read,
    round((temp_blks_read * 8192.0 / 1024 / 1024)::numeric, 2)     AS temp_read_mb,
    left(query, 200)                                                  AS query_snippet
FROM pg_stat_statements
WHERE temp_blks_written > 0
ORDER BY temp_blks_written DESC
LIMIT 20;

Query Behavior

#03

Sequential Scan Hotspots

WhatTables hit mostly with full sequential scans — a missing index signal.

Look forseq_scan_pct > 50% on tables with n_live_tup > 10k

ActionAdd a targeted index on the filtered columns; verify with EXPLAIN.

SQL▾

SELECT
    schemaname,
    relname AS tablename,
    seq_scan,
    seq_tup_read,
    idx_scan,
    round(seq_scan::numeric / nullif(seq_scan + idx_scan, 0) * 100, 2)    AS seq_scan_pct,
    n_live_tup,
    pg_size_pretty(pg_total_relation_size(schemaname || '.' || relname)) AS total_size
FROM pg_stat_user_tables
WHERE seq_scan > 0
  AND n_live_tup > 10000
ORDER BY seq_scan DESC
LIMIT 20;

Query Behavior

#04

N+1 Patterns

WhatFast queries called thousands of times — classic ORM N+1 symptom.

Look forcalls > 10,000 and mean_exec_ms < 10

ActionBatch with IN clause; enable ORM eager loading; use prepared statements.

Requires: pg_stat_statements

SQL▾

SELECT
    calls,
    round(mean_exec_time::numeric, 4)                              AS mean_exec_ms,
    round(total_exec_time::numeric, 2)                             AS total_exec_ms,
    round(rows::numeric / nullif(calls, 0), 2)                    AS rows_per_call,
    left(query, 200)                                               AS query_snippet
FROM pg_stat_statements
WHERE calls > 10000
  AND mean_exec_time < 10
ORDER BY calls DESC
LIMIT 20;

Query Behavior

#05

JIT Compilation Overhead

WhatQueries where JIT compilation time exceeds its execution benefit.

Look fortotal_jit_ms close to or greater than mean_exec_ms

ActionSET jit = off for the session; raise jit_above_cost in postgresql.conf.

Requires: pg_stat_statements, PostgreSQL 14+

SQL▾

SELECT
    calls,
    round(mean_exec_time::numeric, 2)                                   AS mean_exec_ms,
    jit_functions,
    round(jit_generation_time::numeric, 2)                              AS jit_gen_ms,
    round(jit_inlining_time::numeric, 2)                                AS jit_inline_ms,
    round(jit_optimization_time::numeric, 2)                            AS jit_opt_ms,
    round(jit_emission_time::numeric, 2)                                AS jit_emit_ms,
    round((jit_generation_time + jit_inlining_time
           + jit_optimization_time + jit_emission_time)::numeric, 2)   AS total_jit_ms,
    left(query, 200)                                                     AS query_snippet
FROM pg_stat_statements
WHERE jit_functions > 0
ORDER BY jit_generation_time + jit_inlining_time
         + jit_optimization_time + jit_emission_time DESC
LIMIT 15;

Index Health

#06

Unused Indexes

WhatIndexes never used in a query scan since the last pg_stat_reset.

Look foridx_scan = 0 on non-primary, non-unique indexes

ActionDROP after verifying; check stats_reset date first.

SQL▾

SELECT
    s.schemaname,
    s.relname AS tablename,
    s.indexrelname AS indexname,
    pg_size_pretty(pg_relation_size(s.indexrelid)) AS index_size,
    s.idx_scan,
    s.idx_tup_read,
    s.idx_tup_fetch
FROM pg_stat_user_indexes s
JOIN pg_index i ON i.indexrelid = s.indexrelid
WHERE s.idx_scan = 0
  AND NOT i.indisprimary
  AND NOT i.indisunique
  AND pg_relation_size(s.indexrelid) > 0
ORDER BY pg_relation_size(s.indexrelid) DESC;

-- Stats reset time (judge staleness of idx_scan = 0)
SELECT stats_reset FROM pg_stat_database WHERE datname = current_database();

Index Health

#07

Duplicate / Redundant Indexes

WhatMultiple indexes covering the exact same column set.

Look forAny row — duplicates add write overhead with no query benefit.

ActionKeep the most specific one; DROP the rest.

SQL▾

SELECT
    indrelid::regclass                                                      AS table_name,
    array_agg(indexrelid::regclass ORDER BY indexrelid)                    AS duplicate_indexes,
    array_agg(
        pg_size_pretty(pg_relation_size(indexrelid)) ORDER BY indexrelid
    )                                                                       AS sizes,
    indkey::text                                                            AS index_columns
FROM pg_index
GROUP BY indrelid, indkey
HAVING count(*) > 1
ORDER BY indrelid::regclass::text;

Index Health

#08

Invalid Indexes

WhatIndexes left in invalid state — typically from a failed CREATE INDEX CONCURRENTLY.

Look forAny row — invalid indexes waste space and are never used by the planner.

ActionDROP index_name; then recreate with CONCURRENTLY.

SQL▾

SELECT
    n.nspname                                              AS schemaname,
    c.relname                                              AS tablename,
    i.relname                                              AS indexname,
    pg_size_pretty(pg_relation_size(i.oid))               AS wasted_size
FROM pg_index x
JOIN pg_class c ON c.oid = x.indrelid
JOIN pg_class i ON i.oid = x.indexrelid
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE NOT x.indisvalid
  AND n.nspname NOT IN ('pg_catalog', 'information_schema');

Index Health

#09

Missing Foreign Key Indexes

WhatFK columns without a supporting index — causes seq scans on joins and cascades.

Look forAny row — almost always worth indexing.

ActionCREATE INDEX ON table(fk_column);

SQL▾

SELECT
    c.conrelid::regclass                                              AS table_name,
    c.conname                                                         AS constraint_name,
    string_agg(a.attname, ', ' ORDER BY x.n)                        AS fk_columns
FROM pg_constraint c
CROSS JOIN LATERAL unnest(c.conkey) WITH ORDINALITY AS x(attnum, n)
JOIN pg_attribute a ON a.attrelid = c.conrelid AND a.attnum = x.attnum
WHERE c.contype = 'f'
  AND NOT EXISTS (
      SELECT 1 FROM pg_index i
      WHERE i.indrelid = c.conrelid
        AND (i.indkey::int2[])[0 : array_length(c.conkey, 1) - 1]
            @> c.conkey
  )
GROUP BY c.conrelid, c.conname
ORDER BY table_name;

Index Health

#10

Index Bloat

WhatIndexes with high fragmentation — free-space waste from deletes and updates.

Look forbloat_pct_estimate > 30% on large indexes

ActionREINDEX CONCURRENTLY indexname

SQL▾

WITH index_info AS (
    SELECT
        n.nspname                                              AS schemaname,
        ct.relname                                             AS tablename,
        ci.relname                                             AS indexname,
        pg_relation_size(ci.oid)                               AS index_bytes,
        ci.relpages                                            AS actual_pages,
        ceil(ci.reltuples * 14
             / (current_setting('block_size')::int * 0.8)
        )                                                      AS estimated_min_pages
    FROM pg_index x
    JOIN pg_class ci ON ci.oid = x.indexrelid
    JOIN pg_class ct ON ct.oid = x.indrelid
    JOIN pg_namespace n ON n.oid = ci.relnamespace
    WHERE n.nspname NOT IN ('pg_catalog', 'information_schema')
      AND ci.relpages > 0
)
SELECT
    schemaname,
    tablename,
    indexname,
    pg_size_pretty(index_bytes)                                            AS index_size,
    actual_pages,
    estimated_min_pages::int,
    round(
        ((1 - estimated_min_pages / nullif(actual_pages, 0)) * 100)::numeric, 2
    )                                                                      AS bloat_pct_estimate
FROM index_info
WHERE index_bytes > 1024 * 1024
  AND actual_pages > estimated_min_pages
ORDER BY index_bytes DESC
LIMIT 20;

Tables & Storage

#11

Table Bloat

WhatTables with large amounts of dead / unreclaimable space.

Look forbloat_pct_estimate > 20% | bloat size > 100 MB

ActionVACUUM ANALYZE table (online); pg_repack (no lock); or VACUUM FULL (full lock).

SQL▾

WITH constants AS (
    SELECT current_setting('block_size')::int AS bs,
           23 AS hdr, 8 AS ma
),
per_table AS (
    SELECT
        ns.nspname AS schemaname, tbl.relname AS tablename,
        tbl.relpages, tbl.reltuples, bs,
        (sum((1 - s.null_frac) * s.avg_width)::int + hdr + ma
          - CASE WHEN hdr % ma = 0 THEN ma ELSE hdr % ma END) AS row_data_width
    FROM pg_class tbl
    JOIN pg_namespace ns    ON ns.oid = tbl.relnamespace
    JOIN pg_attribute att   ON att.attrelid = tbl.oid AND att.attnum > 0 AND NOT att.attisdropped
    JOIN pg_stats s         ON s.schemaname = ns.nspname AND s.tablename = tbl.relname
                            AND s.attname = att.attname
    CROSS JOIN constants
    WHERE tbl.relkind = 'r' AND ns.nspname NOT IN ('pg_catalog','information_schema')
    GROUP BY ns.nspname, tbl.relname, tbl.relpages, tbl.reltuples, bs, hdr, ma
)
SELECT
    schemaname, tablename,
    relpages                                                           AS actual_pages,
    round(reltuples)                                                   AS est_row_count,
    pg_size_pretty((relpages * bs)::bigint)                           AS total_size,
    pg_size_pretty(
        greatest(0, relpages - ceil(reltuples * row_data_width / bs))::bigint * bs
    )                                                                  AS bloat_size_estimate,
    round(
        (greatest(0, 1 - ceil(reltuples * row_data_width / bs)
                        / nullif(relpages, 0)) * 100)::numeric, 2
    )                                                                  AS bloat_pct_estimate
FROM per_table
WHERE relpages > 10
ORDER BY greatest(0, relpages - ceil(reltuples * row_data_width / bs)) DESC
LIMIT 20;

Tables & Storage

#12

TOAST Table Bloat

WhatOversized TOAST tables storing large column values (text, jsonb, bytea, arrays).

Look fortoast_to_table_pct > 200% — TOAST larger than the main table

ActionVACUUM the parent table; consider compressing values at the application layer.

SQL▾

SELECT
    n.nspname                                                         AS schemaname,
    c.relname                                                         AS tablename,
    pg_size_pretty(pg_relation_size(c.oid))                          AS table_size,
    pg_size_pretty(pg_relation_size(t.oid))                          AS toast_size,
    round(
        pg_relation_size(t.oid)::numeric
        / nullif(pg_relation_size(c.oid), 0) * 100, 2
    )                                                                 AS toast_to_table_pct
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
JOIN pg_class t     ON t.oid = c.reltoastrelid
WHERE c.relkind = 'r'
  AND n.nspname NOT IN ('pg_catalog', 'information_schema')
  AND pg_relation_size(t.oid) > 1024 * 1024
ORDER BY pg_relation_size(t.oid) DESC
LIMIT 20;

Tables & Storage

#13

Table & Index Size Ranking

WhatLargest objects ranked by total, heap, index, and TOAST size.

Look forindexes_size >> heap_size (over-indexed); unexpected size growth.

ActionInvestigate large objects; review index necessity.

SQL▾

SELECT
    schemaname,
    tablename,
    pg_size_pretty(
        pg_total_relation_size(schemaname || '.' || tablename)
    )                                                                  AS total_size,
    pg_size_pretty(
        pg_relation_size(schemaname || '.' || tablename)
    )                                                                  AS heap_size,
    pg_size_pretty(
        pg_indexes_size(schemaname || '.' || tablename)
    )                                                                  AS indexes_size,
    pg_size_pretty(
        pg_total_relation_size(schemaname || '.' || tablename)
        - pg_relation_size(schemaname || '.' || tablename)
        - pg_indexes_size(schemaname || '.' || tablename)
    )                                                                  AS toast_size
FROM pg_tables
WHERE schemaname NOT IN ('pg_catalog', 'information_schema')
ORDER BY pg_total_relation_size(schemaname || '.' || tablename) DESC
LIMIT 30;

Tables & Storage

#14

Table Access Patterns

WhatHeap vs index fetch ratio, write load, and dead tuple ratio per table.

Look forHigh seq_tup_read + low idx_tup_fetch → missing index | dead_pct > 10% → vacuum urgently needed

ActionAdd index for seq scan tables; run VACUUM ANALYZE for high dead_pct.

SQL▾

SELECT
    schemaname,
    relname AS tablename,
    seq_scan,
    seq_tup_read,
    idx_scan,
    idx_tup_fetch,
    n_tup_ins,
    n_tup_upd,
    n_tup_del,
    n_tup_hot_upd,
    n_live_tup,
    n_dead_tup,
    round(
        n_dead_tup::numeric / nullif(n_live_tup + n_dead_tup, 0) * 100, 2
    )                                                                  AS dead_pct
FROM pg_stat_user_tables
ORDER BY seq_tup_read + idx_tup_fetch DESC
LIMIT 25;

Vacuum & Stats

#15

Autovacuum Worker Activity

WhatCurrently running vacuum workers and their per-table progress.

Look forStuck workers (pct_done not advancing); index_vacuum_count = 0 on large tables.

ActionCheck autovacuum_max_workers; investigate I/O contention.

SQL▾

SELECT
    pid,
    datname,
    relid::regclass                                                    AS table_name,
    phase,
    heap_blks_total,
    heap_blks_scanned,
    heap_blks_vacuumed,
    round(
        heap_blks_vacuumed::numeric / nullif(heap_blks_total, 0) * 100, 2
    )                                                                  AS pct_done,
    index_vacuum_count,
    max_dead_tuples,
    num_dead_tuples
FROM pg_stat_progress_vacuum;

-- Active autovacuum worker count
SELECT count(*) AS active_autovacuum_workers
FROM pg_stat_activity
WHERE backend_type = 'autovacuum worker';

Vacuum & Stats

#16

Dead Tuple Urgency

WhatTables accumulating dead tuples faster than vacuum clears them.

Look fordead_pct > 10% | last_autovacuum = NULL or many days ago

ActionVACUUM ANALYZE tablename; lower autovacuum_vacuum_scale_factor.

SQL▾

SELECT
    schemaname,
    relname AS tablename,
    n_dead_tup,
    n_live_tup,
    round(
        n_dead_tup::numeric / nullif(n_live_tup + n_dead_tup, 0) * 100, 2
    )                                                                  AS dead_pct,
    n_mod_since_analyze,
    last_vacuum,
    last_autovacuum,
    last_analyze,
    last_autoanalyze,
    pg_size_pretty(
        pg_relation_size(relid)
    )                                                                  AS table_size
FROM pg_stat_user_tables
WHERE n_dead_tup > 1000
ORDER BY n_dead_tup DESC
LIMIT 25;

Vacuum & Stats

#17

Stale Statistics

WhatTables with many row modifications since last ANALYZE — causes bad query plans.

Look formod_pct > 10% | time_since_analyze > 1 day on hot tables

ActionANALYZE tablename; lower autovacuum_analyze_scale_factor for busy tables.

SQL▾

SELECT
    schemaname,
    relname AS tablename,
    n_live_tup,
    n_mod_since_analyze,
    round(
        n_mod_since_analyze::numeric / nullif(n_live_tup, 0) * 100, 2
    )                                                                  AS mod_pct,
    last_analyze,
    last_autoanalyze,
    now() - greatest(last_analyze, last_autoanalyze)                  AS time_since_analyze
FROM pg_stat_user_tables
WHERE n_live_tup > 1000
ORDER BY n_mod_since_analyze DESC
LIMIT 20;

Vacuum & Stats

#18

Long-Running Transactions

WhatOpen transactions blocking autovacuum and holding locks.

Look forxact_duration > 5 minutes

ActionSELECT pg_terminate_backend(pid) after investigation; set idle_in_transaction_session_timeout.

SQL▾

SELECT
    pid,
    usename,
    application_name,
    client_addr,
    state,
    wait_event_type,
    wait_event,
    now() - xact_start                                                AS xact_duration,
    now() - query_start                                               AS query_duration,
    left(query, 200)                                                   AS current_query
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
  AND now() - xact_start > interval '5 minutes'
  AND pid <> pg_backend_pid()
ORDER BY xact_start ASC;

Connections & Locks

#19

Connection Saturation

WhatCurrent connections vs max_connections — headroom remaining.

Look forused_pct > 80% — approaching the connection limit

ActionAdd PgBouncer; audit idle connections; reduce application pool size.

SQL▾

-- Summary
SELECT
    count(*)                                                               AS total,
    count(*) FILTER (WHERE state = 'active')                              AS active,
    count(*) FILTER (WHERE state = 'idle')                                AS idle,
    count(*) FILTER (WHERE state = 'idle in transaction')                 AS idle_in_txn,
    count(*) FILTER (WHERE state = 'idle in transaction (aborted)')       AS idle_in_txn_aborted,
    count(*) FILTER (WHERE wait_event IS NOT NULL AND state = 'active')   AS waiting,
    s.setting::int                                                         AS max_connections,
    round(count(*)::numeric / s.setting::int * 100, 2)                   AS used_pct,
    s.setting::int - count(*)                                              AS free_slots
FROM pg_stat_activity, pg_settings s
WHERE s.name = 'max_connections' AND pg_stat_activity.pid <> pg_backend_pid()
GROUP BY s.setting;

-- Per application breakdown
SELECT application_name, state, count(*) AS connections
FROM pg_stat_activity
WHERE pid <> pg_backend_pid()
GROUP BY application_name, state
ORDER BY count(*) DESC LIMIT 20;

Connections & Locks

#20

Idle-in-Transaction

WhatSessions sitting idle inside an open transaction — silently hold locks and block autovacuum.

Look foridle_duration > 30 seconds

ActionFix application to commit/rollback promptly; SET idle_in_transaction_session_timeout = '30s'.

SQL▾

SELECT
    pid,
    usename,
    application_name,
    client_addr,
    now() - state_change                                              AS idle_duration,
    now() - xact_start                                                AS txn_open_duration,
    left(query, 200)                                                   AS last_query
FROM pg_stat_activity
WHERE state IN ('idle in transaction', 'idle in transaction (aborted)')
ORDER BY state_change ASC;

Connections & Locks

#21

Lock Wait Tree

WhatFull chain of who is blocking whom.

Look forAny row — every lock wait degrades throughput.

ActionIdentify root blocker (where blocking_pids = '{}') and investigate or terminate.

SQL▾

-- Blocking summary
SELECT
    pid                                                                AS blocked_pid,
    usename                                                            AS blocked_user,
    pg_blocking_pids(pid)                                             AS blocking_pids,
    cardinality(pg_blocking_pids(pid))                                AS blocking_depth,
    wait_event_type,
    wait_event,
    state,
    now() - query_start                                               AS waiting_duration,
    left(query, 200)                                                   AS blocked_query
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0
ORDER BY waiting_duration DESC;

-- Detailed lock mode breakdown
SELECT l.pid, l.locktype, l.relation::regclass AS locked_object,
       l.mode, l.granted, a.usename, a.state, left(a.query, 150) AS query
FROM pg_locks l
JOIN pg_stat_activity a ON a.pid = l.pid
WHERE NOT l.granted
   OR l.pid IN (
       SELECT unnest(pg_blocking_pids(pid))
       FROM pg_stat_activity
       WHERE cardinality(pg_blocking_pids(pid)) > 0
   )
ORDER BY l.pid;

Connections & Locks

#22

Wait Events Breakdown

WhatWhat all active sessions are currently waiting on.

Look forLock/LWLock waits > a few sessions; IO:DataFileRead spikes (I/O bound).

ActionCross-reference with lock tree; investigate I/O if DataFileRead dominates.

SQL▾

SELECT
    wait_event_type,
    wait_event,
    count(*)                                                          AS sessions,
    array_agg(pid ORDER BY pid)                                      AS pids
FROM pg_stat_activity
WHERE wait_event IS NOT NULL
  AND pid <> pg_backend_pid()
GROUP BY wait_event_type, wait_event
ORDER BY sessions DESC;

Replication

#23

Streaming Replication Lag

WhatPer-standby write, flush, and replay lag.

Look forreplay_lag > 30s | flush_lag > 10s

ActionCheck standby I/O; verify network throughput; review recovery configuration.

Requires: streaming replication configured

SQL▾

SELECT
    application_name,
    client_addr,
    state,
    sent_lsn,
    write_lsn,
    flush_lsn,
    replay_lsn,
    write_lag,
    flush_lag,
    replay_lag,
    sync_state,
    pg_size_pretty(
        pg_wal_lsn_diff(pg_current_wal_lsn(), sent_lsn)
    )                                                                 AS unsent_wal
FROM pg_stat_replication
ORDER BY replay_lag DESC NULLS LAST;

Replication

#24

Logical Replication Slot Lag

WhatWAL accumulating for logical replication consumers.

Look forconsumer_lag_size > 500 MB — risk of disk exhaustion.

ActionCheck consumer health; if consumer is gone: SELECT pg_drop_replication_slot('name').

Requires: logical replication slots configured

SQL▾

SELECT
    slot_name,
    plugin,
    database,
    active,
    active_pid,
    pg_size_pretty(
        pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn)
    )                                                                 AS consumer_lag_size,
    pg_wal_lsn_diff(
        pg_current_wal_lsn(), confirmed_flush_lsn
    )                                                                 AS consumer_lag_bytes
FROM pg_replication_slots
WHERE slot_type = 'logical'
ORDER BY consumer_lag_bytes DESC NULLS LAST;

Replication

#25

Replication Slot WAL Retention

WhatTotal WAL held on disk by ALL slots (streaming + logical).

Look forwal_retained approaching your pg_wal partition free space.

ActionDrop inactive slots; advance or drop lagging slots.

Requires: replication slots exist

SQL▾

SELECT
    slot_name,
    slot_type,
    active,
    pg_size_pretty(
        pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)
    )                                                                 AS wal_retained,
    pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)               AS wal_retained_bytes
FROM pg_replication_slots
ORDER BY wal_retained_bytes DESC NULLS LAST;

-- Total WAL held across all slots
SELECT pg_size_pretty(
    sum(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn))
) AS total_wal_held_by_slots
FROM pg_replication_slots;

Risk Signals

#26

XID Wraparound Risk

WhatDistance from transaction ID exhaustion (hard limit: ~2 billion XIDs).

Look forxid_age > 1.5B → emergency VACUUM FREEZE; pct_used > 70% → plan maintenance.

ActionVACUUM FREEZE on oldest tables; lower autovacuum_freeze_max_age.

SQL▾

-- Database level
SELECT
    datname,
    age(datfrozenxid)                                                 AS xid_age,
    2147483647 - age(datfrozenxid)                                   AS xid_remaining,
    round(age(datfrozenxid)::numeric / 2147483647 * 100, 2)         AS pct_used
FROM pg_database
ORDER BY age(datfrozenxid) DESC;

-- Table level (top 20 oldest)
SELECT
    n.nspname AS schemaname, c.relname AS tablename,
    age(c.relfrozenxid) AS xid_age,
    round(age(c.relfrozenxid)::numeric / 2147483647 * 100, 2) AS pct_used,
    pg_size_pretty(pg_relation_size(c.oid)) AS table_size
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'r' AND n.nspname NOT IN ('pg_catalog','information_schema')
ORDER BY age(c.relfrozenxid) DESC
LIMIT 20;

Risk Signals

#27

MultiXact ID Wraparound Risk

WhatDistance from MultiXact exhaustion — a separate counter used for row-level locks.

Look formxid_age > 1 billion → tables need VACUUM FREEZE

ActionVACUUM FREEZE tablename; lower autovacuum_multixact_freeze_max_age.

SQL▾

SELECT
    datname,
    mxid_age(datminmxid)                                             AS mxid_age,
    2147483647 - mxid_age(datminmxid)                               AS mxid_remaining,
    round(mxid_age(datminmxid)::numeric / 2147483647 * 100, 2)     AS pct_used
FROM pg_database
ORDER BY mxid_age(datminmxid) DESC;

Risk Signals

#28

Sequence Exhaustion Risk

WhatSequences approaching their maximum value — integer overflow causes INSERT failures.

Look forpct_used > 80% on bigint; any significant % on int (max = 2.1B).

ActionALTER SEQUENCE seq MAXVALUE new_max; or ALTER COLUMN type TO bigint.

SQL▾

SELECT
    schemaname,
    sequencename,
    data_type,
    last_value,
    min_value,
    max_value,
    increment_by,
    cycle,
    round(
        (last_value - min_value)::numeric
        / nullif(max_value - min_value, 0) * 100, 2
    )                                                                 AS pct_used,
    (max_value - last_value) / nullif(increment_by, 0)               AS values_remaining
FROM pg_sequences
WHERE NOT cycle
  AND last_value IS NOT NULL
ORDER BY pct_used DESC NULLS LAST
LIMIT 20;

Config & Health

#29

Key GUC Settings Review

WhatCritical configuration parameters and their source (default vs tuned).

Look forsource = 'default' on memory/checkpoint settings — often too conservative for production.

ActionTune in postgresql.conf; reload with SELECT pg_reload_conf().

SQL▾

SELECT
    name, setting, unit, source, short_desc
FROM pg_settings
WHERE name IN (
    'shared_buffers', 'work_mem', 'maintenance_work_mem', 'effective_cache_size',
    'checkpoint_timeout', 'checkpoint_completion_target',
    'max_wal_size', 'min_wal_size', 'wal_level', 'wal_compression',
    'autovacuum', 'autovacuum_max_workers', 'autovacuum_vacuum_cost_delay',
    'autovacuum_vacuum_scale_factor', 'autovacuum_analyze_scale_factor',
    'autovacuum_freeze_max_age',
    'max_connections', 'superuser_reserved_connections',
    'idle_in_transaction_session_timeout', 'statement_timeout',
    'max_parallel_workers_per_gather', 'max_worker_processes', 'max_parallel_workers',
    'random_page_cost', 'seq_page_cost', 'effective_io_concurrency',
    'enable_jit', 'jit_above_cost',
    'log_min_duration_statement', 'log_lock_waits',
    'deadlock_timeout', 'log_temp_files', 'lock_timeout'
)
ORDER BY
    CASE
        WHEN name LIKE '%buffer%' OR name LIKE '%mem%'     THEN 1
        WHEN name LIKE '%checkpoint%' OR name LIKE '%wal%' THEN 2
        WHEN name LIKE '%autovacuum%'                      THEN 3
        WHEN name LIKE '%connection%' OR name LIKE '%timeout%' THEN 4
        ELSE 5
    END, name;

Config & Health

#30

Buffer Cache Hit Ratio

WhatHow often reads are served from shared_buffers vs disk, per table and globally.

Look forhit_ratio_pct < 95% on hot tables

ActionIncrease shared_buffers; investigate seq scan storms evicting hot pages.

SQL▾

-- Per table
SELECT
    schemaname, relname AS tablename,
    heap_blks_read, heap_blks_hit,
    round(heap_blks_hit::numeric
        / nullif(heap_blks_read + heap_blks_hit, 0) * 100, 2)        AS hit_ratio_pct,
    idx_blks_read, idx_blks_hit,
    round(idx_blks_hit::numeric
        / nullif(idx_blks_read + idx_blks_hit, 0) * 100, 2)          AS idx_hit_ratio_pct
FROM pg_statio_user_tables
WHERE heap_blks_read + heap_blks_hit > 1000
ORDER BY heap_blks_read DESC
LIMIT 20;

-- Global database hit ratio
SELECT
    sum(blks_hit) AS total_hits,
    sum(blks_read) AS total_reads,
    round(sum(blks_hit)::numeric
        / nullif(sum(blks_hit) + sum(blks_read), 0) * 100, 2) AS global_hit_ratio_pct
FROM pg_stat_database
WHERE datname NOT IN ('template0','template1');

Config & Health

#31

Checkpoint & WAL Pressure

WhatWhether checkpoints are forced too frequently and whether backends are writing directly to disk.

Look forforced_pct > 10% → increase max_wal_size | buffers_backend_fsync > 0 → critical

ActionIncrease max_wal_size; set checkpoint_completion_target = 0.9.

SQL▾

SELECT
    checkpoints_timed,
    checkpoints_req,
    round(
        checkpoints_req::numeric
        / nullif(checkpoints_timed + checkpoints_req, 0) * 100, 2
    )                                                                 AS forced_pct,
    round((checkpoint_write_time / 1000)::numeric, 2)                AS write_time_sec,
    round((checkpoint_sync_time / 1000)::numeric, 2)                 AS sync_time_sec,
    buffers_checkpoint,
    buffers_clean,
    maxwritten_clean,
    buffers_backend,
    buffers_backend_fsync,
    buffers_alloc,
    now() - stats_reset                                               AS stats_age
FROM pg_stat_bgwriter;

Config & Health

#32

Database-Level Summary

WhatPer-database throughput, cache hit ratio, deadlocks, and temp usage at a glance.

Look forrollback_pct > 5% | deadlocks > 0 | cache_hit_pct < 95%

ActionInvestigate rollback sources; add deadlock_timeout logging; tune shared_buffers.

SQL▾

SELECT
    datname,
    numbackends                                                        AS active_backends,
    xact_commit,
    xact_rollback,
    round(
        xact_rollback::numeric / nullif(xact_commit + xact_rollback, 0) * 100, 2
    )                                                                  AS rollback_pct,
    round(
        blks_hit::numeric / nullif(blks_read + blks_hit, 0) * 100, 2
    )                                                                  AS cache_hit_pct,
    tup_inserted, tup_updated, tup_deleted,
    conflicts,
    temp_files,
    pg_size_pretty(temp_bytes)                                        AS temp_usage,
    deadlocks,
    pg_size_pretty(pg_database_size(datname))                        AS db_size,
    now() - stats_reset                                               AS stats_age
FROM pg_stat_database
WHERE datname NOT IN ('template0','template1')
ORDER BY numbackends DESC;

Config & Health

#33

WAL Generation Rate

WhatWAL (Write-Ahead Log) generation volume and rate since stats reset.

Look forHigh wal_mb_per_hour (e.g. > 1000 MB/hr) | high fpi_pct (> 20%)

ActionEnable wal_compression; tune max_wal_size and checkpoint_timeout.

Requires: PostgreSQL 14+

SQL▾

SELECT
    wal_records,
    wal_fpi,
    pg_size_pretty(wal_bytes)                                                 AS total_wal_size,
    round(wal_bytes / 1024.0 / 1024.0, 2)                                     AS total_wal_mb,
    round(
        (wal_bytes / 1024.0 / 1024.0)
        / nullif(extract(epoch from (now() - stats_reset)) / 3600.0, 0)::numeric, 2
    )                                                                          AS wal_mb_per_hour,
    round(
        wal_fpi::numeric / nullif(wal_records, 0) * 100, 2
    )                                                                          AS fpi_pct,
    stats_reset
FROM pg_stat_wal;

Tables & Storage

#34

Partitioned Table Health

WhatPartitioned tables, partition counts, and total sizes.

Look forpartition_count > 100 | partition_count = 0

ActionMerge old partitions or partition by larger range; create missing partitions.

SQL▾

SELECT
    n.nspname                                                                  AS schemaname,
    c.relname                                                                  AS table_name,
    count(i.inhrelid)                                                          AS partition_count,
    pg_size_pretty(pg_total_relation_size(c.oid))                              AS total_size,
    pg_size_pretty(pg_relation_size(c.oid))                                    AS parent_size
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
LEFT JOIN pg_inherits i ON i.inhparent = c.oid
WHERE c.relkind = 'p'
  AND n.nspname NOT IN ('pg_catalog', 'information_schema')
GROUP BY n.nspname, c.relname, c.oid
ORDER BY partition_count DESC;

Risk Signals

#35

Open Prepared Transactions

WhatUncommitted prepared transactions (2PC/two-phase commit).

Look forAny row older than 5 minutes (blocks vacuum, holds locks)

ActionRun COMMIT PREPARED '<gid>'; or ROLLBACK PREPARED '<gid>';.

SQL▾

SELECT
    gid,
    prepared,
    owner,
    database,
    now() - prepared                                                          AS age,
    transaction::text                                                          AS xid
FROM pg_prepared_xacts
ORDER BY prepared ASC;

Config & Health

#36

I/O Stats by Backend (pg_stat_io)

WhatI/O statistics broken down by backend type, target object, and context.

Look forHigh evictions | high temp relation reads/writes

ActionIncrease work_mem if temp relation I/O is high; increase shared_buffers if evictions are high; tune checkpointer if writes dominate backends.

Requires: PostgreSQL 16+, track_io_timing = on (optional for timings)

SQL▾

SELECT
    backend_type,
    object,
    context,
    reads,
    round(read_time::numeric, 2)                                              AS read_time_ms,
    writes,
    round(write_time::numeric, 2)                                             AS write_time_ms,
    hits,
    evictions,
    round(reads::numeric / nullif(reads + hits, 0) * 100, 2)                  AS read_pct
FROM pg_stat_io
WHERE reads + writes + hits > 0
ORDER BY reads + writes DESC;

Inventory & Extensions

#37

Extension Inventory

WhatAll installed extensions with version and upgrade availability.

Look forupgrade_available = true | extensions in public schema (security risk) | unexpected extensions

ActionRun ALTER EXTENSION <name> UPDATE; for stale versions; move security-sensitive extensions to a dedicated schema.

SQL▾

SELECT
    e.extname                                                                  AS extension,
    e.extversion                                                               AS installed_version,
    ae.default_version                                                         AS latest_version,
    e.extversion <> ae.default_version                                         AS upgrade_available,
    n.nspname                                                                  AS schema,
    e.extrelocatable                                                           AS relocatable,
    obj_description(e.oid, 'pg_extension')                                    AS description
FROM pg_extension e
JOIN pg_available_extensions ae ON ae.name = e.extname
JOIN pg_namespace n ON n.oid = e.extnamespace
ORDER BY upgrade_available DESC, e.extname;

Inventory & Extensions

#38

Foreign Data Wrappers & Foreign Tables

WhatFDW servers, user mappings, and foreign tables.

Look forStale or unconfigured user mappings | foreign tables with no active server | unexpected remote servers

ActionDROP SERVER <name> CASCADE for decommissioned remotes; verify user mapping credentials; audit foreign table ownership.

SQL▾

SELECT
    fs.srvname                                                                 AS server_name,
    fdw.fdwname                                                                AS fdw_type,
    fs.srvoptions                                                              AS server_options,
    ft.foreign_table_schema,
    ft.foreign_table_name,
    ft.foreign_server_name,
    um.umoptions                                                               AS user_mapping_options
FROM information_schema.foreign_tables ft
JOIN pg_foreign_server fs ON fs.srvname = ft.foreign_server_name
JOIN pg_foreign_data_wrapper fdw ON fdw.oid = fs.srvfdw
LEFT JOIN pg_user_mappings um
    ON um.srvname = fs.srvname
   AND um.usename = current_user
ORDER BY fs.srvname, ft.foreign_table_schema, ft.foreign_table_name;

Inventory & Extensions

#39

Function Performance

WhatExecution stats for user-defined functions and stored procedures.

Look forHigh total_ms (CPU consumers) | self_pct > 50% (inefficient body) | high calls with low mean_ms (hot-loop)

ActionProfile high-self_time functions; cache repeated lookups; consider set-returning SQL rewrites.

Requires: track_functions = 'pl' or 'all' in postgresql.conf

SQL▾

SELECT
    schemaname,
    funcname,
    calls,
    round(total_time::numeric, 2)                                             AS total_ms,
    round(self_time::numeric, 2)                                              AS self_ms,
    round((total_time / nullif(calls, 0))::numeric, 3)                        AS mean_ms,
    round((self_time / nullif(total_time, 0) * 100)::numeric, 1)              AS self_pct
FROM pg_stat_user_functions
WHERE calls > 0
ORDER BY total_time DESC
LIMIT 25;

Inventory & Extensions

#40

Schema Size Breakdown

WhatStorage consumed per schema — tables, indexes, and TOAST.

Look forSchemas growing unexpectedly | index_pct > 60% (over-indexed) | large toast_size vs table_size

ActionInvestigate largest schemas for bloat (sections 11, 12); review indexes for schemas with index_pct > 60%.

SQL▾

SELECT
    n.nspname                                                                  AS schema,
    count(c.oid)                                                               AS table_count,
    pg_size_pretty(sum(pg_total_relation_size(c.oid)))                         AS total_size,
    pg_size_pretty(sum(pg_relation_size(c.oid)))                               AS table_size,
    pg_size_pretty(sum(pg_indexes_size(c.oid)))                                AS index_size,
    pg_size_pretty(
        sum(pg_total_relation_size(c.oid))
        - sum(pg_relation_size(c.oid))
        - sum(pg_indexes_size(c.oid))
    )                                                                          AS toast_size,
    round(
        sum(pg_indexes_size(c.oid))::numeric
        / nullif(sum(pg_total_relation_size(c.oid)), 0) * 100, 1
    )                                                                          AS index_pct
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'r'
  AND n.nspname NOT IN ('pg_catalog', 'information_schema', 'pg_toast')
GROUP BY n.nspname
ORDER BY sum(pg_total_relation_size(c.oid)) DESC;

SQL Diagnostic Sections

Slow / Expensive Queries

Temp File & work_mem Pressure

Sequential Scan Hotspots

N+1 Patterns

JIT Compilation Overhead

Unused Indexes

Duplicate / Redundant Indexes

Invalid Indexes

Missing Foreign Key Indexes

Index Bloat

Table Bloat

TOAST Table Bloat

Table & Index Size Ranking

Table Access Patterns

Autovacuum Worker Activity

Dead Tuple Urgency

Stale Statistics

Long-Running Transactions

Connection Saturation

Idle-in-Transaction

Lock Wait Tree

Wait Events Breakdown

Streaming Replication Lag

Logical Replication Slot Lag

Replication Slot WAL Retention

XID Wraparound Risk

MultiXact ID Wraparound Risk

Sequence Exhaustion Risk

Key GUC Settings Review

Buffer Cache Hit Ratio

Checkpoint & WAL Pressure

Database-Level Summary

WAL Generation Rate

Partitioned Table Health

Open Prepared Transactions

I/O Stats by Backend (pg_stat_io)

Extension Inventory

Foreign Data Wrappers & Foreign Tables

Function Performance

Schema Size Breakdown