Chandra
Database / Query Execution & Optimization

Query Execution & Optimization

Query Pipeline

Every SQL query passes through these stages:

flowchart LR
    SQL["SELECT ..."] --> Parser
    Parser -->|Parse Tree| Rewriter
    Rewriter -->|Query Tree| Planner
    Planner -->|Plan Tree| Executor
    Executor -->|Tuples| Result

1. Parser: Converts SQL text to an AST (parse tree). Validates syntax, resolves table/column names, checks permissions.

2. Rewriter: Applies semantic transformations — expands views, unfolds nested queries, simplifies expressions, applies rules (PostgreSQL rules system).

3. Planner (Optimizer): The most complex stage. Generates multiple execution plans and picks the cheapest (cost-based optimization).

4. Executor: Executes the chosen plan. Pulls tuples through the plan tree (Volcano-style iterator model).

Scan Methods

Scan TypeDescriptionWhen Used
Sequential ScanRead all rows sequentiallyNo index, large portion of table needed
Index ScanWalk B-Tree → fetch heap tupleHighly selective queries
Index-Only ScanRead from index alone, no heap fetchIndex covers all needed columns
Bitmap ScanMultiple index scans → bitmap → heap fetchCombination of conditions, moderate selectivity
TID ScanDirect row access by CTIDFrom subquery or WITH (known row location)

Each scan method has a cost formula that multiplies the number of pages read by a cost factor (sequential vs random I/O). The optimizer picks the cheapest based on table statistics.

Join Algorithms

Nested Loop Join

for each row in outer (smaller) relation:
    for each row in inner (larger) relation:
        if match: emit joined row
VariantComplexityWhen Used
PlainO(n*m)Small outer, no index on inner
IndexedO(n * log m)Small outer, index on inner join key
MaterializedO(n*m)Small outer, inner materialized in memory

Best for: Small outer relation (<1000 rows), especially with an index on the inner relation.

Hash Join

1. Build: Scan inner relation, build hash table on join key
2. Probe: Scan outer relation, probe hash table for matches
PhaseMemoryComplexity
BuildO(min(n,m)) hash table sizeO(m)
ProbeOngoingO(n)
Grace Hash Join (disk)When exceeds work_memO(n+m) writes/reads to disk

Best for: Equi-joins on unsorted data where one side can fit in memory.

Merge Join

1. Sort both relations on join key (or use existing order)
2. Iterate both sorted streams in parallel, emitting matches
PhaseMemoryComplexity
SortO(n+m) if not pre-sortedO(n log n + m log m)
MergeO(1)O(n+m)

Best for: Large tables already sorted on join key (e.g., from index order, or when ORDER BY matches join key).

Parallel Query Execution

Modern databases distribute query execution across multiple CPU cores:

graph TD
    Leader[Parallel Leader<br/>gathers results]
    Leader --> W1[Worker 1<br/>Partial scan/join]
    Leader --> W2[Worker 2<br/>Partial scan/join]
    Leader --> W3[Worker 3<br/>Partial scan/join]
    W1 --> PG[Partial Aggregate<br/>count=30]
    W2 --> PG2[Partial Aggregate<br/>count=25]
    W3 --> PG3[Partial Aggregate<br/>count=28]
    PG & PG2 & PG3 --> Final[Final Aggregate<br/>count=83]
OperatorParallelismNotes
Seq ScanMultiple workers scan page rangesEach worker reads a portion of the table
Index ScanMultiple workers scan index rangesRange-based partitioning
Hash JoinBuild hash in parallel, probe in parallelShared hash or partitioned hash
AggregatePartial → gather → finalTwo-phase aggregation
SortMerge sort: each worker sorts a chunkGATHER MERGE sorts final output

Limitations:

  • Only sequential scans, index scans, joins, and aggregates can be parallelized
  • CTEs (WITH) are optimization fences in some databases
  • Small tables or queries with LIMIT often don’t use parallel plans (overhead > benefit)

Cost Estimation

Databases use table and column statistics to estimate the cost of each plan. The optimizer computes a cost for each possible plan node based on estimated row counts, I/O costs (sequential vs random), and CPU costs. Each database engine has its own cost model and system catalogs for statistics.

Common Query Optimizations

| Problem | Symptom | Fix | |---|---|---|---| | Missing index | Table scan on large table | Add index on WHERE/JOIN columns | | Wrong join order | Nested loop on large tables | Update statistics, check join column indexes | | Stale statistics | Suboptimal plan choice | Refresh table statistics (ANALYZE-equivalent) | | Subquery repeated | Same subquery executed N times | Rewrite as join or use CTE |

Index Design Heuristics

  1. Prefix equality columns in compound indexes: WHERE a = 1 AND b > 5 → index on (a, b)
  2. Covering indexes: Include all queried columns to enable index-only scans
  3. Partial indexes: CREATE INDEX ... WHERE status = 'active' — smaller and faster
  4. Included columns (SQL Server, PostgreSQL 11+): Non-key columns in index for covering