Introduction
Have you ever started a bulk data load that ran lightning-fast initially, only to watch it grind to a crawl hours later? You're not alone. This is one of the most common—and most misunderstood—performance issues in Oracle databases.
In this post, I'll share a real-world case study where a data load that started at 550,000 rows/hour degraded to just 40,000 rows/hour—a 93% performance drop. More importantly, I'll explain exactly why this happens and how to fix it.
The Scenario
Our team was loading data into a staging table as part of a nightly ETL process. The setup looked like this:
Table Configuration:
CREATE TABLE STG_CUSTOMER_EXTRACT (
    CUSTOMER_ID     NUMBER,
    ACCOUNT_NUMBER  VARCHAR2(50),
    BILLING_ID      VARCHAR2(30),
    REGION_CODE     VARCHAR2(10),
    LOAD_DATE       DATE
);
-- Three indexes to support downstream queries
CREATE INDEX IDX_CUSTOMER_ID ON STG_CUSTOMER_EXTRACT(CUSTOMER_ID);
CREATE INDEX IDX_ACCOUNT_NUM ON STG_CUSTOMER_EXTRACT(ACCOUNT_NUMBER);
CREATE INDEX IDX_BILLING_ID ON STG_CUSTOMER_EXTRACT(BILLING_ID);
| Attribute | Value |
|---|---|
| Target Row Count | 87 million |
| Table Size | ~8 GB |
| Number of Indexes | 3 B-tree indexes |
| Load Method | Row-by-row INSERT |
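The load itself was a conventional, row-at-a-time insert. It did roughly the following (a simplified sketch of the pattern; CUSTOMER_SOURCE is an illustrative name, not the actual ETL code):

-- Simplified sketch of a row-by-row load (CUSTOMER_SOURCE is a hypothetical source)
BEGIN
  FOR rec IN (SELECT customer_id, account_number, billing_id, region_code
                FROM customer_source)
  LOOP
    INSERT INTO stg_customer_extract
      (customer_id, account_number, billing_id, region_code, load_date)
    VALUES
      (rec.customer_id, rec.account_number, rec.billing_id, rec.region_code, SYSDATE);
  END LOOP;
  COMMIT;
END;
/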
Nothing unusual, right? Just a standard staging table with some indexes. What could go wrong?
The Observation: A Tale of Two Performance Profiles
We monitored the load throughout the night and observed something alarming:
| Time | Rows Loaded | Insert Rate | Avg Time/Insert | Status |
|---|---|---|---|---|
| 3:00 PM | 550,000 | 550K/hour | 0.003 sec | ✅ Great |
| 4:00 PM | 800,000 | 248K/hour | 0.011 sec | ⚠️ Slowing |
| 5:00 PM | 850,000 | 51K/hour | 0.066 sec | 🔴 Problem |
| 9:00 PM | 2.1M | 44K/hour | 0.078 sec | 🔴 Critical |
| 2:00 AM | 3.2M | 26K/hour | 0.101 sec | 🔴 Severe |
The insert time increased by roughly 3,300% (from 0.003s to 0.101s per row), a more than 33-fold slowdown.
But here's what really caught our attention—the Logical I/O per insert also increased dramatically:
| Time | Logical I/O per Insert |
|---|---|
| 3:00 PM | 561 |
| 5:00 PM | 794 |
| 2:00 AM | 805 |
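Figures like these can be sampled from V$SQL while the load is running; a minimal sketch, assuming you know the INSERT statement's SQL_ID (the literal below is only a placeholder):

-- Logical I/O and throughput per execution of the load's INSERT (SQL_ID is a placeholder)
SELECT sql_id,
       executions,
       rows_processed,
       buffer_gets,
       ROUND(buffer_gets / NULLIF(executions, 0)) AS logical_io_per_insert
  FROM v$sql
 WHERE sql_id = 'abcd1234xyz99';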
Why would a simple INSERT require over 40% more I/O as the load progressed?
The Root Cause: B-tree Index Maintenance
The answer lies in understanding how Oracle maintains B-tree indexes during INSERT operations.
What Happens on Every Single INSERT
When you insert one row into a table with 3 indexes, Oracle must:
Step 1: INSERT the row into a table data block → 2-3 I/Os
Step 2: For EACH of the 3 indexes, Oracle must:
Navigate from Root block
Traverse to correct Branch block
Find the correct Leaf block
Insert the index entry in sorted order
If the leaf block is full → SPLIT the block
Cost per index: 4-6 I/Os
Total I/O per INSERT: 14-21 logical I/Os (instead of just 2-3 without indexes)
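You can see this hidden work for yourself: sample your session's logical reads immediately before and after a single INSERT and diff the two values. A minimal sketch:

-- Current session's cumulative logical I/O; run before and after one INSERT and subtract
SELECT sn.name, ms.value
  FROM v$mystat ms
  JOIN v$statname sn ON sn.statistic# = ms.statistic#
 WHERE sn.name = 'session logical reads';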
Visual: The Hidden Work Behind One INSERT
[Diagram: a single INSERT fans out into three separate B-tree traversals, one per index (CUSTOMER_ID, ACCOUNT_NUM, BILLING_ID), each costing 4-6 I/Os to locate and update the correct leaf block.]
Multiply this by 87 million rows, and you begin to see the problem.
Why Performance Degrades Over Time
The Index Height Problem
| Milestone | Rows Loaded | Index Height | I/O per Insert | Leaf Block Splits | Insert Time | Status |
|---|---|---|---|---|---|---|
| Start | 0 - 20M | 2-3 levels | ~6 I/Os | Rare | 0.003s | ✅ Fast |
| Early | 20M - 50M | 3-4 levels | ~9 I/Os | Moderate | 0.02s | ⚠️ Slowing |
| Middle | 50M - 87M | 4 levels | ~12 I/Os | Frequent | 0.05s | 🔴 Slow |
| End | 87M+ | 4-5 levels | 15-18 I/Os | Very Frequent | 0.08-0.10s | 🔴 Critical |
Understanding the Progression
Each stage follows directly from index growth: as more rows arrive, every index gets taller and its leaf blocks fill up, so each new INSERT must traverse more levels and is increasingly likely to land in a full leaf block and trigger a split. The per-row cost keeps climbing even though the SQL never changes.
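You can watch the height grow on your own system; a quick sketch against USER_INDEXES (BLEVEL is the number of branch levels, so height is BLEVEL + 1, and the values are only as fresh as your statistics):

-- Index height and width for the staging table's indexes
SELECT index_name,
       blevel + 1 AS height,
       leaf_blocks
  FROM user_indexes
 WHERE table_name = 'STG_CUSTOMER_EXTRACT';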
The Block Split Problem
Cost of Each Block Split
| Operation | Additional I/Os | Redo Generated |
|---|---|---|
| Allocate new block | 1 | Yes |
| Write original block (half entries) | 1 | Yes |
| Write new block (other half) | 1 | Yes |
| Update parent branch block | 2 | Yes |
| Total | 6+ I/Os | Significant |
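Block splits are visible in the instance statistics, so you can confirm they are ramping up during the load; a sketch against V$SYSSTAT:

-- System-wide block split counts since instance startup; sample periodically and diff
SELECT name, value
  FROM v$sysstat
 WHERE name IN ('leaf node splits',
                'leaf node 90-10 splits',
                'branch node splits');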
The Redo Log Cascade
| Wait Event | Increase from Baseline |
|---|---|
| log file sync | +993% |
| log file switch (checkpoint incomplete) | +548% |
| write complete waits | +7,329% |
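These waits show up at the instance level as well; a sketch against V$SYSTEM_EVENT for the three events above:

-- Cumulative waits for the redo-related events since instance startup
SELECT event,
       total_waits,
       ROUND(time_waited_micro / 1e6) AS seconds_waited
  FROM v$system_event
 WHERE event IN ('log file sync',
                 'log file switch (checkpoint incomplete)',
                 'write complete waits')
 ORDER BY time_waited_micro DESC;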
The Math: Why This Matters
❌ With Indexes (Current State): 87 million rows × 14-21 logical I/Os per insert ≈ 1.2-1.8 billion logical I/Os
✅ Without Indexes (Optimized): 87 million rows × 2-3 logical I/Os per insert ≈ 300 million logical I/Os (plus a one-time index rebuild afterwards)
The Solution: Load First, Index Later
Instead of paying for index maintenance 87 million times, we disabled it during the load and paid for it once at the end:

-- Step 1: Reduce redo generation during the load
ALTER TABLE STG_CUSTOMER_EXTRACT NOLOGGING;

-- Step 2: Make the indexes unusable (no index maintenance during INSERT)
ALTER INDEX IDX_CUSTOMER_ID UNUSABLE;
ALTER INDEX IDX_ACCOUNT_NUM UNUSABLE;
ALTER INDEX IDX_BILLING_ID UNUSABLE;

-- Step 3: Load the data with SQL*Loader in direct-path, parallel mode
sqlldr userid=etl_user/password \
       control=customer_extract.ctl \
       DIRECT=TRUE \
       PARALLEL=TRUE \
       ROWS=500000 \
       log=customer_extract.log

-- Step 4: Rebuild the indexes using parallel processing
ALTER INDEX IDX_CUSTOMER_ID REBUILD PARALLEL 8 NOLOGGING;
ALTER INDEX IDX_ACCOUNT_NUM REBUILD PARALLEL 8 NOLOGGING;
ALTER INDEX IDX_BILLING_ID REBUILD PARALLEL 8 NOLOGGING;

-- Step 5: Reset to normal settings
ALTER TABLE STG_CUSTOMER_EXTRACT LOGGING;
ALTER INDEX IDX_CUSTOMER_ID LOGGING NOPARALLEL;
ALTER INDEX IDX_ACCOUNT_NUM LOGGING NOPARALLEL;
ALTER INDEX IDX_BILLING_ID LOGGING NOPARALLEL;

-- Step 6: Gather fresh statistics
EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'STG_CUSTOMER_EXTRACT');

Results: Before and After
| Metric | Before (With Indexes) | After (Optimized) | Improvement |
|---|---|---|---|
| Insert rate | 40,000/hour (degraded) | 2,000,000+/hour | 50x faster |
| Time per insert | 0.08-0.10 sec | 0.001-0.003 sec | 30-100x faster |
| Total I/O | 1.65 billion | 474 million | 71% less |
| Redo generated | Massive | Minimal | 90%+ reduction |
| Total load time | 10+ hours | 45 min + 15 min rebuild | 90% faster |
| Database impact | Severe contention | Minimal | Dramatically improved |
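One quick sanity check worth running once the rebuild finishes, to confirm every index is usable again before downstream queries hit the table (a sketch):

-- All three indexes should report VALID after the rebuild
SELECT index_name, status
  FROM user_indexes
 WHERE table_name = 'STG_CUSTOMER_EXTRACT';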