Create the index

Index the key used by targeted writes and enable incremental maintenance:

CREATE INDEX idx_orders_id
ON TABLE sales.orders (order_id)
USING btree
WITH (auto_update = true);
The index is stored as a child Delta table under the parent table, not inside a proprietary replacement format.

Run a targeted update

The index helps the engine locate the matching row. The update itself still commits through normal Delta Lake transaction semantics.

UPDATE sales.orders
SET status = 'shipped',
    shipped_at = CURRENT_TIMESTAMP
WHERE order_id = 987654321;
This is most useful when the predicate matches one row or a small set of rows in a large table.

The same index helps MERGE and DELETE

MERGE INTO sales.orders AS target
USING staging.order_updates AS source
ON target.order_id = source.order_id
WHEN MATCHED THEN UPDATE SET
    status = source.status;
The expensive locate step can use the indexed key instead of searching every parent file.

Keep the index current

With auto_update = true, each parent commit incrementally re-indexes changed files. For bulk loads, manual maintenance can be cheaper:

ALTER INDEX idx_orders_id
ON TABLE sales.orders
SET (auto_update = false);

INSERT INTO sales.orders
SELECT * FROM staging.orders_today;

REBUILD INDEX idx_orders_id
ON TABLE sales.orders;
A rebuild scans the parent table and replaces the child index atomically.

When a B+ tree fits

  • Point updates by order, customer, device, or transaction ID.
  • Narrow range lookups on uneven or random key distributions.
  • Repeated MERGE operations on the same business key.
  • Selective DELETE statements on large Delta tables.

Indexes add storage and write cost. They do not help broad scans that touch most of the table, and small tables may already be cheap to scan.

Inspect the index

DESCRIBE INDEXES ON TABLE sales.orders;
The result shows the algorithm, indexed columns, auto-update setting, parent version, leaf count, and size.

Why this matters

Delta Lake is excellent at analytical scans, but a selective update should not need an analytical scan just to find one row. A row-level B+ tree adds that lookup path while leaving the underlying Delta table interoperable.