Why We Do Indexing: Not Just for Speed, but for Cost and Hardware Reality

Author: Shuwen

As software engineers, we are often asked a familiar question: why do we use database indexing?

The textbook answer is straightforward:

  • To make queries faster
  • To reduce time complexity from O(n) to O(log n)
  • By using data structures such as B-trees or B+-trees
  • At the cost of extra storage and maintenance during inserts, updates, and deletes
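The complexity claim in the textbook answer can be sketched in a few lines of Python: a linear scan versus a binary search over a sorted list, which is the same idea a B-tree applies at each node. All names and sizes here are illustrative.

```python
import bisect

# A toy "table" of one million sorted IDs, standing in for an indexed column.
ids = list(range(0, 2_000_000, 2))

def linear_scan(ids, target):
    """O(n): examine every row until a match, like a full table scan."""
    for i, v in enumerate(ids):
        if v == target:
            return i
    return -1

def index_lookup(ids, target):
    """O(log n): binary search, the core idea behind a B-tree traversal."""
    i = bisect.bisect_left(ids, target)
    if i < len(ids) and ids[i] == target:
        return i
    return -1

# Both find the same row; one examines a million entries, the other ~20.
assert linear_scan(ids, 1_999_998) == index_lookup(ids, 1_999_998) == 999_999
```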

All of this is true. But it is not the full story.

There is a deeper, more practical reason behind indexing — one that matters greatly in real systems, cloud databases, and production costs.

The Missing Layer: Disk I/O Is the Real Bottleneck

Databases ultimately store data on disk.

Even in modern systems with caching and memory layers:

  • Memory access ≈ nanoseconds
  • Disk access ≈ milliseconds

That is a gap of five to six orders of magnitude: disk is roughly a hundred thousand to a million times slower.

When a query has no usable index, the database must:

  • Scan rows
  • Fetch pages from disk
  • Perform repeated I/O operations

Each of these steps translates into physical disk reads.

Indexing Reduces Disk Access — Not Just CPU Time

Indexes allow the database engine to:

  • Jump directly to relevant data pages
  • Read far fewer disk blocks
  • Avoid full table scans

This means:

  • Fewer disk reads
  • Less I/O pressure
  • Better cache utilization
  • Lower latency

In other words, indexing is fundamentally an I/O optimization, not just a CPU optimization.

Disk Has a Cost — and a Lifespan

Every disk read:

  • Consumes I/O bandwidth
  • Adds hardware wear (mechanical wear on HDDs; on SSDs, wear comes mainly from writes and read-disturb handling)
  • Competes with other workloads

On physical hardware, excessive I/O:

  • Shortens disk lifespan
  • Increases operational risk

In cloud environments, it directly affects billing.

Cloud Reality: I/O Is Money

In managed databases such as Amazon Aurora, pricing includes:

  • Storage
  • Compute
  • I/O operations

In its standard (pay-per-request) configuration, Aurora bills for:

  • The number of read and write I/O requests against the storage layer

So when a query:

  • Scans millions of rows
  • Performs unnecessary disk reads
  • Repeats this thousands or millions of times

You are not just slowing down your system — you are burning money.

A Concrete Example

Imagine a legacy system with:

  • Poor or missing indexes
  • Heavy read traffic
  • Analytical or reporting queries

If that system generates:

100 billion I/O requests per month

Even a small per-request cost becomes enormous at that scale.

Now introduce:

  • Proper indexing
  • Query rewrites
  • Better access patterns

If indexing reduces disk reads by 80–90%:

  • Query latency drops
  • CPU usage stabilizes
  • Cloud cost drops dramatically

This is not theoretical. This happens all the time in real production systems.

The Trade-Off: Index Maintenance Is Worth It

Yes, indexing has costs:

  • Extra storage
  • Slower writes
  • Index maintenance during INSERT / UPDATE / DELETE

But in most real-world systems:

  • Reads vastly outnumber writes
  • Cloud databases charge for I/O
  • Disk access dominates cost and latency

So the trade-off strongly favors indexing.
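The trade-off can be sketched with assumed page-I/O costs; every constant here is invented for illustration, but the shape of the result holds for any read-heavy workload:

```python
SCAN_PAGES = 100_000    # assumed pages an unindexed lookup reads (full scan)
SEEK_PAGES = 4          # assumed pages an indexed lookup reads
WRITE_EXTRA = 2         # assumed extra page writes per row to maintain the index

def monthly_page_ios(reads, writes, indexed):
    """Total page I/Os for a month's workload, with or without an index."""
    read_ios = reads * (SEEK_PAGES if indexed else SCAN_PAGES)
    write_ios = writes * (1 + (WRITE_EXTRA if indexed else 0))
    return read_ios + write_ios

# A read-heavy month: 10M lookups, 1M writes.
with_index = monthly_page_ios(10_000_000, 1_000_000, indexed=True)
without_index = monthly_page_ios(10_000_000, 1_000_000, indexed=False)

# The extra write I/O is dwarfed by the read savings.
assert with_index < without_index
```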

A Better Way to Answer “Why Indexing?”

Instead of saying, “Indexing makes queries faster,” a stronger answer is:

Indexing reduces disk I/O, which improves performance, extends hardware life, and significantly lowers cloud costs.

Final Takeaway

Indexing exists because:

  • Disk access is slow
  • Disk access is expensive
  • Disk access is limited
  • Disk access is billed in the cloud

Speed is just the visible benefit. Cost and hardware reality are the hidden reasons.

© 2025 Shuwen