PostgreSQL Performance Tuning for High-Traffic Indian Applications
PostgreSQL is the database of choice for serious Indian startups and enterprises, and for good reason. It is reliable, feature-rich, open source, and handles complex queries that MySQL struggles with. But PostgreSQL performance tuning becomes critical once your application starts handling thousands of concurrent users, during a Diwali sale, a viral product launch, or simply steady growth that pushes your database past its default configuration.
This guide covers the practical PostgreSQL performance tuning techniques that make the difference between a database that handles 100 requests per second and one that handles 10,000. These are proven techniques used in Indian e-commerce platforms, fintech applications, and SaaS products that serve millions of users.
The First 80%: Configuration That Matters
PostgreSQL ships with conservative default settings designed to run on minimal hardware. For a production Indian application running on a 16 GB RAM, 4-core server (a common RDS configuration), here are the settings that have the biggest impact:
Memory Settings
shared_buffers: Set to 25% of total RAM. For a 16 GB server, this is 4 GB. This is PostgreSQL's main cache. Data pages read from disk are stored here for fast access.
effective_cache_size: Set to 75% of total RAM (12 GB for our example). This does not allocate memory; it tells the query planner how much cache is likely available, helping it choose better query plans.
work_mem: Start with 64 MB. This is per-operation memory for sorts and hash joins. Be careful: if you have 100 concurrent connections and each runs a query with 3 sort operations, that is 100 x 3 x 64 MB = 19.2 GB of potential memory usage. Monitor and adjust based on your actual query patterns.
maintenance_work_mem: Set to 512 MB to 1 GB. This speeds up VACUUM, CREATE INDEX, and ALTER TABLE operations. Since these run infrequently, you can be generous here.
WAL and Checkpoint Settings
wal_buffers: Set to 64 MB. The auto-tuned default (capped at 16 MB) is too small for write-heavy workloads.
checkpoint_completion_target: Set to 0.9 (default is 0.5). This spreads checkpoint I/O over a longer period, reducing I/O spikes.
max_wal_size: Set to 2 GB to 4 GB. This reduces checkpoint frequency, which is particularly important for write-heavy Indian applications (e-commerce order processing, logging, analytics).
Connection Settings
max_connections: Keep this lower than you might expect: 100 to 200 is enough for most applications. Each connection consumes approximately 10 MB of RAM. Use a connection pooler instead of increasing this value.
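Taken together, the settings above map to a postgresql.conf fragment like this one, assuming the 16 GB RAM, 4-core server described earlier; adjust the numbers for your own hardware:

```ini
# Memory
shared_buffers = 4GB               # 25% of RAM: PostgreSQL's main page cache
effective_cache_size = 12GB        # planner hint only; no memory is allocated
work_mem = 64MB                    # per sort/hash operation; multiply by concurrency
maintenance_work_mem = 1GB         # VACUUM, CREATE INDEX, ALTER TABLE

# WAL and checkpoints
wal_buffers = 64MB
checkpoint_completion_target = 0.9
max_wal_size = 4GB

# Connections
max_connections = 150              # pair with a connection pooler, not a higher value
```

Apply these via ALTER SYSTEM or by editing postgresql.conf directly; most take effect on reload, but shared_buffers and max_connections require a restart.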
Indexing: The Single Biggest Performance Lever
Proper indexing is the difference between a query that takes 2 seconds and one that takes 2 milliseconds. Here are the indexing strategies that matter most for Indian applications:
Composite Indexes for Multi-Column Queries
If your application frequently queries orders by user_id and status, a composite index is far more effective than two separate indexes:
CREATE INDEX idx_orders_user_status ON orders (user_id, status);
Column order matters. Put the most selective column first. If you have 10,000 users but only 5 order statuses, user_id should come first.
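A sketch of the queries this index does and does not serve; because user_id leads the index, queries filtering on user_id alone are also covered (table and values are illustrative):

```sql
-- Served by idx_orders_user_status: equality on both leading columns
SELECT id, total
FROM orders
WHERE user_id = 4217 AND status = 'pending';

-- Also served: the leading column alone
SELECT count(*) FROM orders WHERE user_id = 4217;

-- NOT served efficiently: filtering on status alone skips the leading column
SELECT count(*) FROM orders WHERE status = 'pending';
```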
Partial Indexes for Common Filters
Indian e-commerce applications often query for active or pending records. A partial index on just those records is smaller and faster:
CREATE INDEX idx_orders_pending ON orders (created_at)
WHERE status = 'pending';
This index is a fraction of the size of a full index and is used automatically when your query includes WHERE status = 'pending'.
GIN Indexes for JSONB
If you store product attributes, user preferences, or configuration in JSONB columns (common in Indian applications that handle diverse product categories), GIN indexes make queries against JSONB data fast:
CREATE INDEX idx_products_attributes ON products USING GIN (attributes);
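The containment operator @> is the most common way to hit this index; the attribute values below are hypothetical:

```sql
-- Uses idx_products_attributes via the @> containment operator
SELECT id, name
FROM products
WHERE attributes @> '{"brand": "Titan", "category": "watches"}';

-- Key-existence checks (?) also use the GIN index
SELECT count(*) FROM products WHERE attributes ? 'warranty_months';
```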
Index Maintenance
Indexes degrade over time as data is inserted, updated, and deleted. Schedule regular REINDEX operations during off-peak hours (typically 2-5 AM IST for Indian applications); on PostgreSQL 12+, use REINDEX CONCURRENTLY to avoid blocking writes. Monitor index bloat using the pgstattuple extension.
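A sketch of a bloat check and rebuild, assuming the pgstattuple extension is available and the idx_orders_user_status index from earlier exists:

```sql
CREATE EXTENSION IF NOT EXISTS pgstattuple;

-- avg_leaf_density well below ~90% suggests bloat worth rebuilding
SELECT avg_leaf_density, leaf_fragmentation
FROM pgstatindex('idx_orders_user_status');

-- PostgreSQL 12+: rebuild without blocking writes
REINDEX INDEX CONCURRENTLY idx_orders_user_status;
```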
Query Optimization Techniques
Use EXPLAIN ANALYZE Religiously
Every slow query should be analyzed with EXPLAIN ANALYZE. Look for:
- Seq Scan on large tables: This usually means a missing index.
- Nested Loop with high row counts: Consider if a hash join or merge join would be more efficient.
- Sort operations spilling to disk: Increase work_mem or add an index that provides sorted output.
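A typical invocation; the BUFFERS option, which shows cache hits versus disk reads, is worth enabling by default (the query itself is illustrative):

```sql
EXPLAIN (ANALYZE, BUFFERS)
SELECT user_id, sum(total)
FROM orders
WHERE created_at >= now() - interval '7 days'
GROUP BY user_id;
-- Watch for: "Seq Scan on orders", "Sort Method: external merge  Disk: ...",
-- and a large gap between estimated and actual row counts.
```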
Avoid SELECT *
Always specify the columns you need. This reduces I/O, memory usage, and network transfer. For wide tables with 30+ columns (common in Indian enterprise applications), the difference is dramatic.
Batch Operations
Instead of inserting or updating rows one at a time (a pattern we see constantly in Indian applications), use batch operations:
INSERT INTO orders (user_id, product_id, quantity)
VALUES
(1, 101, 2),
(2, 102, 1),
(3, 103, 5);
Batch sizes of 500-1000 rows provide the best throughput without excessive memory usage.
CTEs vs Subqueries
PostgreSQL 12+ inlines CTEs when possible, but be aware of the optimizer's choices. For complex queries, test both CTEs and subqueries with EXPLAIN ANALYZE and use whichever produces a better plan.
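Since PostgreSQL 12 you can also control inlining explicitly with the MATERIALIZED keyword; a sketch using the orders table from earlier:

```sql
-- Force the CTE to be computed once (pre-12 behavior):
WITH recent AS MATERIALIZED (
    SELECT * FROM orders WHERE created_at >= now() - interval '1 day'
)
SELECT * FROM recent WHERE status = 'pending';

-- Allow the planner to inline it, letting the status filter reach an index:
WITH recent AS NOT MATERIALIZED (
    SELECT * FROM orders WHERE created_at >= now() - interval '1 day'
)
SELECT * FROM recent WHERE status = 'pending';
```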
Connection Pooling: Non-Negotiable for Production
PostgreSQL creates a new process for each connection. With 500 simultaneous users, that is 500 PostgreSQL processes, each consuming 10+ MB of RAM. Connection pooling is not optional for high-traffic applications.
PgBouncer
PgBouncer is the standard connection pooler for PostgreSQL. Configure it in transaction mode for maximum connection reuse:
- Set pool_size to 20-50 per database (significantly less than your application's connection count)
- Set max_client_conn to match your application's needs (500-2000)
- PgBouncer itself uses minimal resources; a single instance can handle thousands of client connections
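These settings map to a pgbouncer.ini fragment like the following; the host, database name, and auth file path are placeholders:

```ini
[databases]
appdb = host=127.0.0.1 port=5432 dbname=appdb

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = scram-sha-256
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction        ; maximum connection reuse
default_pool_size = 20         ; server connections per database/user pair
max_client_conn = 1000         ; client connections PgBouncer will accept
```

One caveat with transaction mode: session-level features such as SET, advisory locks, and (in older PgBouncer versions) prepared statements do not survive across transactions, so verify your driver's compatibility.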
The result: your application sees 500 connections, but PostgreSQL only manages 50. This dramatically reduces memory usage and context switching overhead.
Application-Level Pooling
In addition to PgBouncer, configure connection pooling in your application framework. For Node.js, use pg-pool with a pool size of 10-20. For Python, SQLAlchemy's connection pool should be configured similarly. These pools connect to PgBouncer, which then connects to PostgreSQL, creating a two-tier pooling system that handles traffic spikes gracefully.
Partitioning for Large Tables
Indian applications that store transactional data grow fast. An e-commerce platform processing 10,000 orders per day accumulates millions of rows within a year. Partitioning keeps queries fast:
Range Partitioning by Date
The most common and effective strategy for Indian applications:
CREATE TABLE orders (
id BIGINT GENERATED ALWAYS AS IDENTITY,
user_id BIGINT NOT NULL,
total DECIMAL(10,2) NOT NULL,
created_at TIMESTAMPTZ NOT NULL
) PARTITION BY RANGE (created_at);
CREATE TABLE orders_2026_q1 PARTITION OF orders
FOR VALUES FROM ('2026-01-01') TO ('2026-04-01');
Queries that filter by date automatically scan only the relevant partitions. Archiving old data becomes trivial. Detach the partition and move it to cold storage.
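Creating next quarter's partition and archiving an old one are each a single DDL statement; names follow the example above:

```sql
-- Create the next partition before data arrives for it
CREATE TABLE orders_2026_q2 PARTITION OF orders
    FOR VALUES FROM ('2026-04-01') TO ('2026-07-01');

-- Archive: detach without long locks (PostgreSQL 14+), then dump or move it
ALTER TABLE orders DETACH PARTITION orders_2026_q1 CONCURRENTLY;
```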
Vacuum and Autovacuum Tuning
PostgreSQL's MVCC architecture means that deleted and updated rows leave dead tuples. VACUUM reclaims this space. The default autovacuum settings are too conservative for high-write applications:
- autovacuum_vacuum_scale_factor: Reduce from 0.2 to 0.05. This triggers vacuum after 5% of the table has dead tuples instead of 20%.
- autovacuum_analyze_scale_factor: Reduce from 0.1 to 0.02. This updates table statistics more frequently, helping the query planner.
- autovacuum_max_workers: Increase from 3 to 5-8 for databases with many tables.
- autovacuum_naptime: Reduce from 60 seconds to 15-30 seconds for faster response to dead tuple accumulation.
For tables with extremely high write rates (logging tables, analytics events), consider setting per-table autovacuum parameters that are more aggressive than the global defaults.
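Per-table overrides are set with ALTER TABLE; the events table here stands in for a hypothetical high-write logging table:

```sql
ALTER TABLE events SET (
    autovacuum_vacuum_scale_factor = 0.01,    -- vacuum after 1% dead tuples
    autovacuum_analyze_scale_factor = 0.005,  -- refresh statistics very frequently
    autovacuum_vacuum_cost_delay = 2          -- throttle vacuum less (milliseconds)
);
```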
Monitoring PostgreSQL Performance
You cannot tune what you do not measure. Essential monitoring for production PostgreSQL:
- pg_stat_statements: Enable this extension to track query execution statistics. Identify your slowest and most frequent queries.
- pg_stat_user_tables: Monitor sequential scans, index scans, dead tuples, and table size growth.
- pg_stat_activity: Track active connections, idle connections, and long-running queries.
- Custom dashboards: Build Grafana dashboards that show queries per second, average query duration, cache hit ratio (should be above 99%), and replication lag.
The cache hit ratio is your most important single metric. If it drops below 99%, your database is reading from disk too frequently. Either add more RAM or optimize your queries to access less data.
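Two queries worth wiring into a dashboard: the slowest statements from pg_stat_statements, and the cache hit ratio from pg_stat_database (assumes pg_stat_statements is listed in shared_preload_libraries):

```sql
-- Top 10 statements by total time spent (PostgreSQL 13+ column names)
SELECT query, calls, total_exec_time, mean_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;

-- Cache hit ratio across all databases; aim for above 0.99
SELECT sum(blks_hit)::float / nullif(sum(blks_hit) + sum(blks_read), 0)
FROM pg_stat_database;
```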
Scaling Beyond a Single Server
When a single PostgreSQL server reaches its limits, you have several options:
Read Replicas
For read-heavy applications (most Indian e-commerce and SaaS products), add one or more read replicas. Route analytics queries, search queries, and reporting to replicas. Keep the primary server for writes and real-time reads.
Caching Layer
Add Redis as a caching layer between your application and PostgreSQL. Cache frequently accessed data - user sessions, product catalogs, configuration - with appropriate TTLs. A well-implemented cache layer reduces database load by 60-80%.
Vertical Scaling
Before adding complexity with horizontal scaling, consider vertical scaling. Moving from an RDS db.r6g.xlarge to db.r6g.4xlarge quadruples your CPU and RAM. For many Indian applications, this provides 2-3 more years of headroom at a fraction of the complexity cost.
Real-World Impact
To illustrate the impact, consider a typical Bengaluru e-commerce platform handling 50,000 daily transactions. Applying these tuning techniques can produce results like:
- Average query response time: 180ms down to 12ms
- 99th percentile query time: 2.8 seconds down to 95ms
- Database CPU utilization: 85% down to 35%
- Monthly RDS cost: reduced by 40% through right-sizing after optimization
These improvements also impact cloud infrastructure costs. Lower database CPU means smaller instances, and reduced query times mean fewer application server timeouts and retries.
PostgreSQL performance tuning is not a one-time task. As your data grows and your query patterns evolve, revisit these optimizations quarterly. The database that performs well today will need attention in six months as your Indian application scales to new heights.
For production AI systems that depend on PostgreSQL - vector search with pgvector, embedding storage, conversation logging - these tuning principles become even more critical. A slow database makes even the best AI model feel sluggish to end users.