PostgreSQL Performance Tuning for High-Traffic Indian Applications
PostgreSQL is the database of choice for serious Indian startups and enterprises, and for good reason. It is reliable, feature-rich, open source, and handles complex queries that MySQL struggles with. But PostgreSQL performance tuning becomes critical once your application starts handling thousands of concurrent users, during a Diwali sale, a viral product launch, or simply steady growth that pushes your database past its default configuration.
This guide covers the practical PostgreSQL performance tuning techniques that make the difference between a database that handles 100 requests per second and one that handles 10,000. These are proven techniques used in Indian e-commerce platforms, fintech applications, and SaaS products that serve millions of users.
The First 80%: Configuration That Matters
PostgreSQL ships with conservative default settings designed to run on minimal hardware. For a production Indian application running on a 16 GB RAM, 4-core server (a common RDS configuration), here are the settings that have the biggest impact:
Memory Settings
shared_buffers: Set to 25% of total RAM. For a 16 GB server, this is 4 GB. This is PostgreSQL's main cache. Data pages read from disk are stored here for fast access.
effective_cache_size: Set to 75% of total RAM (12 GB for our example). This does not allocate memory; it tells the query planner how much cache is likely available, helping it choose better query plans.
work_mem: Start with 64 MB. This is per-operation memory for sorts and hash joins. Be careful: if you have 100 concurrent connections and each runs a query with 3 sort operations, that is 100 x 3 x 64 MB = 19.2 GB of potential memory usage. Monitor and adjust based on your actual query patterns.
maintenance_work_mem: Set to 512 MB to 1 GB. This speeds up VACUUM, CREATE INDEX, and ALTER TABLE operations. Since these run infrequently, you can be generous here.
WAL and Checkpoint Settings
wal_buffers: Set to 64 MB. The auto-tuned default (capped at 16 MB) is too small for write-heavy workloads.
checkpoint_completion_target: Set to 0.9 (default is 0.5). This spreads checkpoint I/O over a longer period, reducing I/O spikes.
max_wal_size: Set to 2 GB to 4 GB. This reduces checkpoint frequency, which is particularly important for write-heavy Indian applications (e-commerce order processing, logging, analytics).
Connection Settings
max_connections: Keep this lower than you might expect: 100 to 200 is enough for most applications. Each connection consumes approximately 10 MB of RAM. Use a connection pooler instead of increasing this value.
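Taken together, the settings above map to a postgresql.conf fragment like this one, assuming the 16 GB RAM, 4-core server described earlier; adjust the numbers for your own hardware:

```ini
# Memory
shared_buffers = 4GB               # 25% of RAM: PostgreSQL's main page cache
effective_cache_size = 12GB        # planner hint only; no memory is allocated
work_mem = 64MB                    # per sort/hash operation; multiply by concurrency
maintenance_work_mem = 1GB         # VACUUM, CREATE INDEX, ALTER TABLE

# WAL and checkpoints
wal_buffers = 64MB
checkpoint_completion_target = 0.9
max_wal_size = 4GB

# Connections
max_connections = 150              # pair with a connection pooler, not a higher value
```

Apply these via ALTER SYSTEM or by editing postgresql.conf directly; most take effect on reload, but shared_buffers and max_connections require a restart.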
Indexing: The Single Biggest Performance Lever
Proper indexing is the difference between a query that takes 2 seconds and one that takes 2 milliseconds. Here are the indexing strategies that matter most for Indian applications:
Composite Indexes for Multi-Column Queries
If your application frequently queries orders by user_id and status, a composite index is far more effective than two separate indexes:
CREATE INDEX idx_orders_user_status ON orders (user_id, status);
Column order matters. Put the most selective column first. If you have 10,000 users but only 5 order statuses, user_id should come first.
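A sketch of the queries this index does and does not serve; because user_id leads the index, queries filtering on user_id alone are also covered (table and values are illustrative):

```sql
-- Served by idx_orders_user_status: equality on both leading columns
SELECT id, total
FROM orders
WHERE user_id = 4217 AND status = 'pending';

-- Also served: the leading column alone
SELECT count(*) FROM orders WHERE user_id = 4217;

-- NOT served efficiently: filtering on status alone skips the leading column
SELECT count(*) FROM orders WHERE status = 'pending';
```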
Partial Indexes for Common Filters
Indian e-commerce applications often query for active or pending records. A partial index on just those records is smaller and faster:
CREATE INDEX idx_orders_pending ON orders (created_at)
WHERE status = 'pending';
This index is a fraction of the size of a full index and is used automatically when your query includes WHERE status = 'pending'.
GIN Indexes for JSONB
If you store product attributes, user preferences, or configuration in JSONB columns (common in Indian applications that handle diverse product categories), GIN indexes make queries against JSONB data fast:
CREATE INDEX idx_products_attributes ON products USING GIN (attributes);
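The containment operator @> is the most common way to hit this index; the attribute values below are hypothetical:

```sql
-- Uses idx_products_attributes via the @> containment operator
SELECT id, name
FROM products
WHERE attributes @> '{"brand": "Titan", "category": "watches"}';

-- Key-existence checks (?) also use the GIN index
SELECT count(*) FROM products WHERE attributes ? 'warranty_months';
```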
Index Maintenance
Indexes degrade over time as data is inserted, updated, and deleted. Schedule regular REINDEX operations during off-peak hours (typically 2-5 AM IST for Indian applications); on PostgreSQL 12+, use REINDEX CONCURRENTLY to avoid blocking writes. Monitor index bloat using the pgstattuple extension.
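A sketch of a bloat check and rebuild, assuming the pgstattuple extension is available and the idx_orders_user_status index from earlier exists:

```sql
CREATE EXTENSION IF NOT EXISTS pgstattuple;

-- avg_leaf_density well below ~90% suggests bloat worth rebuilding
SELECT avg_leaf_density, leaf_fragmentation
FROM pgstatindex('idx_orders_user_status');

-- PostgreSQL 12+: rebuild without blocking writes
REINDEX INDEX CONCURRENTLY idx_orders_user_status;
```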
Query Optimization Techniques
Use EXPLAIN ANALYZE Religiously
Every slow query should be analyzed with EXPLAIN ANALYZE. Look for:
- Seq Scan on large tables: This usually means a missing index.
- Nested Loop with high row counts: Consider if a hash join or merge join would be more efficient.
- Sort operations spilling to disk: Increase work_mem or add an index that provides sorted output.
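A typical invocation; the BUFFERS option, which shows cache hits versus disk reads, is worth enabling by default (the query itself is illustrative):

```sql
EXPLAIN (ANALYZE, BUFFERS)
SELECT user_id, sum(total)
FROM orders
WHERE created_at >= now() - interval '7 days'
GROUP BY user_id;
-- Watch for: "Seq Scan on orders", "Sort Method: external merge  Disk: ...",
-- and a large gap between estimated and actual row counts.
```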
Avoid SELECT *
Always specify the columns you need. This reduces I/O, memory usage, and network transfer. For wide tables with 30+ columns (common in Indian enterprise applications), the difference is dramatic.
Batch Operations
Instead of inserting or updating rows one at a time (a pattern we see constantly in Indian applications), use batch operations:
INSERT INTO orders (user_id, product_id, quantity)
VALUES
(1, 101, 2),
(2, 102, 1),
(3, 103, 5);
Batch sizes of 500-1000 rows provide the best throughput without excessive memory usage.
CTEs vs Subqueries
PostgreSQL 12+ inlines CTEs when possible, but be aware of the optimizer's choices. For complex queries, test both CTEs and subqueries with EXPLAIN ANALYZE and use whichever produces a better plan.
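Since PostgreSQL 12 you can also control inlining explicitly with the MATERIALIZED keyword; a sketch using the orders table from earlier:

```sql
-- Force the CTE to be computed once (pre-12 behavior):
WITH recent AS MATERIALIZED (
    SELECT * FROM orders WHERE created_at >= now() - interval '1 day'
)
SELECT * FROM recent WHERE status = 'pending';

-- Allow the planner to inline it, letting the status filter reach an index:
WITH recent AS NOT MATERIALIZED (
    SELECT * FROM orders WHERE created_at >= now() - interval '1 day'
)
SELECT * FROM recent WHERE status = 'pending';
```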
Connection Pooling: Non-Negotiable for Production
PostgreSQL creates a new process for each connection. With 500 simultaneous users, that is 500 PostgreSQL processes, each consuming 10+ MB of RAM. Connection pooling is not optional for high-traffic applications.
PgBouncer
PgBouncer is the standard connection pooler for PostgreSQL. Configure it in transaction mode for maximum connection reuse:
- Set pool_size to 20-50 per database (significantly less than your application's connection count)
- Set max_client_conn to match your application's needs (500-2000)
- PgBouncer itself uses minimal resources; a single instance can handle thousands of client connections
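These settings map to a pgbouncer.ini fragment like the following; the host, database name, and auth file path are placeholders:

```ini
[databases]
appdb = host=127.0.0.1 port=5432 dbname=appdb

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = scram-sha-256
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction        ; maximum connection reuse
default_pool_size = 20         ; server connections per database/user pair
max_client_conn = 1000         ; client connections PgBouncer will accept
```

One caveat with transaction mode: session-level features such as SET, advisory locks, and (in older PgBouncer versions) prepared statements do not survive across transactions, so verify your driver's compatibility.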
The result: your application sees 500 connections, but PostgreSQL only manages 50. This dramatically reduces memory usage and context switching overhead.
Application-Level Pooling
In addition to PgBouncer, configure connection pooling in your application framework. For Node.js, use pg-pool with a pool size of 10-20. For Python, SQLAlchemy's connection pool should be configured similarly. These pools connect to PgBouncer, which then connects to PostgreSQL, creating a two-tier pooling system that handles traffic spikes gracefully.
Partitioning for Large Tables
Indian applications that store transactional data grow fast. An e-commerce platform processing 10,000 orders per day accumulates millions of rows within a year. Partitioning keeps queries fast:
Range Partitioning by Date
The most common and effective strategy for Indian applications:
CREATE TABLE orders (
id BIGINT GENERATED ALWAYS AS IDENTITY,
user_id BIGINT NOT NULL,
total DECIMAL(10,2) NOT NULL,
created_at TIMESTAMPTZ NOT NULL
) PARTITION BY RANGE (created_at);
CREATE TABLE orders_2026_q1 PARTITION OF orders
FOR VALUES FROM ('2026-01-01') TO ('2026-04-01');
Queries that filter by date automatically scan only the relevant partitions. Archiving old data becomes trivial. Detach the partition and move it to cold storage.
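Creating next quarter's partition and archiving an old one are each a single DDL statement; names follow the example above:

```sql
-- Create the next partition before data arrives for it
CREATE TABLE orders_2026_q2 PARTITION OF orders
    FOR VALUES FROM ('2026-04-01') TO ('2026-07-01');

-- Archive: detach without long locks (PostgreSQL 14+), then dump or move it
ALTER TABLE orders DETACH PARTITION orders_2026_q1 CONCURRENTLY;
```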
Vacuum and Autovacuum Tuning
PostgreSQL's MVCC architecture means that deleted and updated rows leave dead tuples. VACUUM reclaims this space. The default autovacuum settings are too conservative for high-write applications:
- autovacuum_vacuum_scale_factor: Reduce from 0.2 to 0.05. This triggers vacuum after 5% of the table has dead tuples instead of 20%.
- autovacuum_analyze_scale_factor: Reduce from 0.1 to 0.02. This updates table statistics more frequently, helping the query planner.
- autovacuum_max_workers: Increase from 3 to 5-8 for databases with many tables.
- autovacuum_naptime: Reduce from 60 seconds to 15-30 seconds for faster response to dead tuple accumulation.
For tables with extremely high write rates (logging tables, analytics events), consider setting per-table autovacuum parameters that are more aggressive than the global defaults.
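Per-table overrides are set with ALTER TABLE; the events table here stands in for a hypothetical high-write logging table:

```sql
ALTER TABLE events SET (
    autovacuum_vacuum_scale_factor = 0.01,    -- vacuum after 1% dead tuples
    autovacuum_analyze_scale_factor = 0.005,  -- refresh statistics very frequently
    autovacuum_vacuum_cost_delay = 2          -- throttle vacuum less (milliseconds)
);
```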
Monitoring PostgreSQL Performance
You cannot tune what you do not measure. Essential monitoring for production PostgreSQL:
- pg_stat_statements: Enable this extension to track query execution statistics. Identify your slowest and most frequent queries.
- pg_stat_user_tables: Monitor sequential scans, index scans, dead tuples, and table size growth.
- pg_stat_activity: Track active connections, idle connections, and long-running queries.
- Custom dashboards: Build Grafana dashboards that show queries per second, average query duration, cache hit ratio (should be above 99%), and replication lag.
The cache hit ratio is your most important single metric. If it drops below 99%, your database is reading from disk too frequently. Either add more RAM or optimize your queries to access less data.
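Two queries worth wiring into a dashboard: the slowest statements from pg_stat_statements, and the cache hit ratio from pg_stat_database (assumes pg_stat_statements is listed in shared_preload_libraries):

```sql
-- Top 10 statements by total time spent (PostgreSQL 13+ column names)
SELECT query, calls, total_exec_time, mean_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;

-- Cache hit ratio across all databases; aim for above 0.99
SELECT sum(blks_hit)::float / nullif(sum(blks_hit) + sum(blks_read), 0)
FROM pg_stat_database;
```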
Scaling Beyond a Single Server
When a single PostgreSQL server reaches its limits, you have several options:
Read Replicas
For read-heavy applications (most Indian e-commerce and SaaS products), add one or more read replicas. Route analytics queries, search queries, and reporting to replicas. Keep the primary server for writes and real-time reads.
Caching Layer
Add Redis as a caching layer between your application and PostgreSQL. Cache frequently accessed data - user sessions, product catalogs, configuration - with appropriate TTLs. A well-implemented cache layer reduces database load by 60-80%.
Vertical Scaling
Before adding complexity with horizontal scaling, consider vertical scaling. Moving from an RDS db.r6g.xlarge to db.r6g.4xlarge quadruples your CPU and RAM. For many Indian applications, this provides 2-3 more years of headroom at a fraction of the complexity cost.
Real-World Impact
To illustrate the impact, consider a typical Bengaluru e-commerce platform handling 50,000 daily transactions. Applying these tuning techniques can produce results like:
- Average query response time: 180ms down to 12ms
- 99th percentile query time: 2.8 seconds down to 95ms
- Database CPU utilization: 85% down to 35%
- Monthly RDS cost: reduced by 40% through right-sizing after optimization
These improvements also impact cloud infrastructure costs. Lower database CPU means smaller instances, and reduced query times mean fewer application server timeouts and retries.
PostgreSQL performance tuning is not a one-time task. As your data grows and your query patterns evolve, revisit these optimizations quarterly. The database that performs well today will need attention in six months as your Indian application scales to new heights.
For production AI systems that depend on PostgreSQL - vector search with pgvector, embedding storage, conversation logging - these tuning principles become even more critical. A slow database makes even the best AI model feel sluggish to end users.