PostgreSQL is a powerful relational database, but getting peak performance requires understanding indexing, query optimization, and configuration tuning. This comprehensive guide covers practical techniques for optimizing your database.
## Query Analysis Tools

### EXPLAIN and EXPLAIN ANALYZE
```sql
-- Show the estimated query plan
EXPLAIN SELECT * FROM users WHERE age > 30;

-- Run the query and show actual execution stats
EXPLAIN ANALYZE SELECT * FROM users WHERE age > 30;
```
### Understanding Output

```
Seq Scan on users  (cost=0.00..25.00 rows=500 width=100) (actual time=0.015..0.310 rows=250 loops=1)
  Filter: (age > 30)
```

- **cost**: estimated startup..total cost, in arbitrary planner units combining disk I/O and CPU
- **rows**: estimated row count; the `actual` section (from ANALYZE) shows the real count
- **width**: average row width in bytes
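Beyond plain ANALYZE, EXPLAIN accepts options; BUFFERS is especially useful because it shows how much of each plan node's work was served from cache versus disk. A quick sketch using the same query:

```sql
-- BUFFERS adds shared-buffer hit/read counts to each plan node
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM users WHERE age > 30;
```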
## Indexing Strategies

### Single Column Index
```sql
-- Basic index
CREATE INDEX idx_users_age ON users(age);

-- For range queries and descending sorts
CREATE INDEX idx_users_created ON users(created_at DESC);
```
### Composite Index

```sql
-- Index on multiple columns for common query patterns
CREATE INDEX idx_users_status_created
    ON users(status, created_at DESC);

-- Efficient for: WHERE status = 'active' ORDER BY created_at DESC
```
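Column order matters: a composite index is only useful when the query constrains its leading column(s). It is worth checking both cases with EXPLAIN; a sketch using the index above:

```sql
-- Can use idx_users_status_created: filters on the leading column (status)
EXPLAIN SELECT * FROM users
WHERE status = 'active'
ORDER BY created_at DESC;

-- Typically cannot use it: the leading column is not constrained,
-- so the index does not provide rows in created_at order
EXPLAIN SELECT * FROM users
ORDER BY created_at DESC;
```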
### Partial Index

```sql
-- Index only active users (saves space)
CREATE INDEX idx_active_users ON users(id)
    WHERE status = 'active';

-- Reduces index size significantly
```
### BRIN Index

```sql
-- Block-range index: suited to very large, naturally ordered tables
CREATE INDEX idx_events_timestamp
    ON events USING BRIN (timestamp);

-- Typically far smaller than a B-tree for time-series data
```
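To see how much space an index actually occupies, `pg_relation_size` works on indexes too. A sketch for comparing all indexes on a table:

```sql
-- Compare on-disk size of each index on the events table
SELECT indexrelname,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE relname = 'events';
```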
## Query Optimization

### Problem: N+1 Queries
```sql
-- Bad: one query per user
SELECT * FROM users;
-- Then for each user: SELECT * FROM posts WHERE user_id = ?

-- Good: a single query with a JOIN
SELECT u.*, p.*
FROM users u
LEFT JOIN posts p ON u.id = p.user_id;
```
### Problem: Unnecessary Columns

```sql
-- Bad: selecting everything
SELECT * FROM large_table WHERE id = 1;

-- Good: select only the columns you need
SELECT id, name, email FROM large_table WHERE id = 1;
```
### Problem: Missing WHERE Clause

```sql
-- Bad: counts every row in the table
SELECT COUNT(*) FROM orders;

-- Better: bound the scan (assumes an index on created_at)
SELECT COUNT(*) FROM orders
WHERE created_at > NOW() - INTERVAL '30 days';
```
## Aggregation Optimization

### Using GROUP BY Efficiently

```sql
-- For a single column, DISTINCT and GROUP BY are usually planned identically
SELECT DISTINCT user_id FROM orders;
SELECT user_id FROM orders GROUP BY user_id;

-- GROUP BY pays off when you also need aggregates
SELECT user_id, COUNT(*) FROM orders GROUP BY user_id;
```
### Materialized Views

```sql
-- Pre-compute an expensive aggregate
CREATE MATERIALIZED VIEW daily_revenue AS
SELECT DATE(created_at) AS day, SUM(amount) AS revenue
FROM orders
GROUP BY DATE(created_at);

-- Refresh periodically (e.g. from a cron job)
REFRESH MATERIALIZED VIEW daily_revenue;
```
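A plain REFRESH locks the view against reads while it rebuilds. With a unique index in place, the CONCURRENTLY option avoids that. A sketch, assuming the view exposes a unique date column (here called `day`):

```sql
-- CONCURRENTLY requires a unique index on the materialized view
CREATE UNIQUE INDEX idx_daily_revenue_day ON daily_revenue (day);

-- Readers are not blocked while the refresh runs
REFRESH MATERIALIZED VIEW CONCURRENTLY daily_revenue;
```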
## Table Design

### Normalization
```sql
-- Bad: denormalized (customer and product data repeated on every order)
CREATE TABLE orders (
    id INT PRIMARY KEY,
    customer_name VARCHAR(100),
    customer_email VARCHAR(100),
    customer_phone VARCHAR(20),
    product_name VARCHAR(100),
    product_price DECIMAL(10, 2)
);

-- Good: normalized into separate tables
CREATE TABLE customers (
    id INT PRIMARY KEY,
    name VARCHAR(100),
    email VARCHAR(100),
    phone VARCHAR(20)
);

CREATE TABLE products (
    id INT PRIMARY KEY,
    name VARCHAR(100),
    price DECIMAL(10, 2)
);

CREATE TABLE orders (
    id INT PRIMARY KEY,
    customer_id INT REFERENCES customers(id),
    product_id INT REFERENCES products(id),
    quantity INT
);
```
### Partitioning Large Tables

```sql
-- Range partitioning by date
CREATE TABLE events (
    id BIGSERIAL,
    user_id INT,
    event_type VARCHAR(50),
    created_at TIMESTAMP
) PARTITION BY RANGE (created_at);

CREATE TABLE events_2026_01 PARTITION OF events
    FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');

CREATE TABLE events_2026_02 PARTITION OF events
    FOR VALUES FROM ('2026-02-01') TO ('2026-03-01');
```
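Two practical additions: a DEFAULT partition catches rows that match no defined range (otherwise the insert fails), and EXPLAIN confirms that partition pruning is working. A sketch using the events table above:

```sql
-- Catch-all for rows outside every defined range
CREATE TABLE events_default PARTITION OF events DEFAULT;

-- The plan should show only events_2026_01 being scanned
EXPLAIN SELECT * FROM events
WHERE created_at >= '2026-01-05' AND created_at < '2026-01-06';
```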
## Configuration Tuning

### Memory Settings

```ini
# postgresql.conf

# Shared buffer cache; a common starting point is ~25% of system RAM
shared_buffers = 4GB

# Working memory per sort/hash operation (allocated per operation, per connection)
work_mem = 16MB

# Memory for maintenance operations (VACUUM, CREATE INDEX, ...)
maintenance_work_mem = 1GB

# Planner's estimate of OS + PostgreSQL cache; 50-75% of system RAM
effective_cache_size = 12GB
```
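One way to sanity-check memory settings is the buffer cache hit ratio; on a hot working set, values well below ~0.99 suggest the cache is undersized. A sketch:

```sql
-- Fraction of block reads served from shared_buffers
SELECT datname,
       round(blks_hit::numeric / NULLIF(blks_hit + blks_read, 0), 4) AS hit_ratio
FROM pg_stat_database
WHERE datname = current_database();
```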
### Connection Pool

```ini
# Maximum concurrent connections
max_connections = 100

# Reserve connections for superuser access
superuser_reserved_connections = 3
```

For many short-lived clients, an external pooler such as PgBouncer is usually more effective than raising max_connections.
### Query Planner

```ini
# Parallel query execution
max_parallel_workers_per_gather = 4
max_parallel_workers = 8

# Random page cost (the 4.0 default assumes spinning disks; lower it for SSDs)
random_page_cost = 1.1
```
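Planner settings can be tried out per session before committing them to postgresql.conf, which makes it easy to see whether a change actually flips a query plan. A sketch:

```sql
-- Experiment in the current session only
SET random_page_cost = 1.1;
EXPLAIN SELECT * FROM users WHERE age > 30;

-- Revert to the server's configured value
RESET random_page_cost;
```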
## Monitoring

### Check Index Usage

```sql
-- Find indexes that have never been scanned
SELECT schemaname, relname, indexrelname,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;
```
### Monitor Query Performance

```sql
-- Requires pg_stat_statements in shared_preload_libraries (server restart)
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- Top 10 slowest queries by mean execution time
SELECT query, mean_exec_time, calls
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;
```
### Table Statistics

```sql
-- Check table sizes (including indexes and TOAST data)
SELECT
    schemaname,
    tablename,
    pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS size
FROM pg_tables
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;
```
## Vacuum and Analyze

### Regular Maintenance

```sql
-- Update planner statistics
ANALYZE table_name;

-- Reclaim space from dead rows
VACUUM table_name;

-- Full vacuum: rewrites the table and takes an exclusive lock
VACUUM FULL table_name;

-- Combined
VACUUM ANALYZE table_name;
```
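To see which tables actually need attention, pg_stat_user_tables tracks dead tuples and the last (auto)vacuum time. A sketch:

```sql
-- Tables with the most dead rows awaiting cleanup
SELECT relname, n_dead_tup, n_live_tup, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;
```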
### Auto-vacuum Configuration

```ini
# postgresql.conf
autovacuum = on
autovacuum_max_workers = 3
autovacuum_naptime = 10s    # default is 1min; shorter means more aggressive
```
## Replication & Backups

### Logical Replication

```sql
-- On the primary (publisher)
CREATE PUBLICATION primary_pub FOR ALL TABLES;

-- On the replica (subscriber); the tables must already exist there
CREATE SUBSCRIPTION primary_sub
    CONNECTION 'postgresql://user@primary_host/db'
    PUBLICATION primary_pub;
```

Note that publications and subscriptions are logical replication; streaming physical replication is configured separately via primary_conninfo and replication slots.
### Backup Strategy

```bash
#!/bin/bash
# Daily custom-format (compressed) backup
pg_dump -Fc -U postgres dbname > /backups/db_$(date +%Y%m%d).dump

# Restore
pg_restore -d dbname /backups/db_20260110.dump
```
## Common Performance Issues
| Issue | Symptom | Solution |
|---|---|---|
| Missing indexes | High seq scans | Add indexes on WHERE/JOIN columns |
| Too many indexes | Slow writes | Drop unused indexes |
| Stale stats | Bad query plans | Run ANALYZE |
| Memory pressure | High swap usage | Lower work_mem or connection count; add RAM |
| Bloated tables | Slow full scans | Run VACUUM FULL |
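The "high seq scans" symptom in the table can be checked directly from the statistics collector. A sketch:

```sql
-- Tables scanned sequentially most often (candidates for new indexes)
SELECT relname, seq_scan, idx_scan, n_live_tup
FROM pg_stat_user_tables
ORDER BY seq_scan DESC
LIMIT 10;
```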
## Conclusion
PostgreSQL performance optimization is an ongoing process. Regular monitoring, strategic indexing, and thoughtful configuration tuning will keep your database running smoothly.
**Key Takeaways:**

- Use EXPLAIN ANALYZE before and after changes
- Index columns used in WHERE, JOIN, and ORDER BY
- Monitor slow queries regularly
- Keep statistics up to date with ANALYZE
- Partition large tables appropriately
- Start shared_buffers at around 25% of system RAM

Start optimizing today!