An index is an optional accelerator that lets a query jump straight to the rows it needs instead of scanning the whole table. Indexes never change query results; they only make queries faster. They are read by the DeltaForge planner; the parent data stays in standard Delta format.
A managed companion to a Delta table
For each indexed column, the index records where the rows for each key value live, so the engine can read just those rows rather than the whole table. The index is itself a child Delta table.
Indexes are consumed by the DeltaForge query planner. Other engines reading the parent Delta table will not pick them up; they fall back to the standard scan path and return the same results.
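The idea of recording where matching rows live, and of falling back to a plain scan with identical results, can be sketched in a few lines of Python. This is an illustration only, not DeltaForge's implementation: the file names, the row layout, and the helper functions build_index, full_scan, and index_lookup are all hypothetical.

```python
from collections import defaultdict

# Hypothetical parent table: file name -> rows stored in that file.
parent_files = {
    "part-0.parquet": [{"id": 1, "city": "Oslo"}, {"id": 2, "city": "Lima"}],
    "part-1.parquet": [{"id": 3, "city": "Oslo"}, {"id": 4, "city": "Pune"}],
}


def build_index(files, column):
    """Map each value of `column` to the (file, row offset) pairs that hold it."""
    index = defaultdict(list)
    for file_name, rows in files.items():
        for offset, row in enumerate(rows):
            index[row[column]].append((file_name, offset))
    return dict(index)


def full_scan(files, column, value):
    """Standard scan path: inspect every row of every file."""
    return [row for rows in files.values() for row in rows if row[column] == value]


def index_lookup(files, index, value):
    """Indexed path: read only the rows the index points at."""
    return [files[file_name][offset] for file_name, offset in index.get(value, [])]


city_index = build_index(parent_files, "city")

# Both paths return the same rows; the indexed path just touches fewer of them.
assert index_lookup(parent_files, city_index, "Oslo") == full_scan(parent_files, "city", "Oslo")
```

An engine that does not know about the index corresponds to calling full_scan directly: it does more work but produces the same answer.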
Indexes complement built-in data skipping; they don't replace it
An index is a running expense, not a free upgrade
Storage: A small fraction of the parent table's size
Write overhead: Every parent write also updates the index when auto-update is on
Build time: One-time scan of the parent at index creation
Pick by access pattern
A learned index built from a piecewise geometric model over the key distribution. Compact on disk; suited to clustered or monotonic keys typical of analytical workloads.
Classic balanced tree with predictable behavior across any key distribution. Choose with USING btree when the data is unsorted or highly random.
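File-level probabilistic test. Each Parquet file carries a bloom filter for the indexed columns; the planner skips files whose filter rejects the predicate. A filter can return false positives but never false negatives, so a skipped file is guaranteed not to contain a match. Tunable through fpp (the target false-positive probability) and num_items (the expected number of distinct values).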
Documented in detail in the architecture reference.
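As a rough illustration of file-level skipping with a bloom filter, the sketch below sizes one filter per file from num_items and fpp using the standard textbook formulas, then keeps only the files whose filter cannot rule a value out. Everything except the two parameter names is an assumption made for the example: the BloomFilter class, the SHA-256 double-hashing scheme, the file names, and files_to_read are hypothetical and say nothing about how DeltaForge stores or probes its filters.

```python
import hashlib
import math


class BloomFilter:
    """Minimal Bloom filter sized from an expected item count and a target fpp."""

    def __init__(self, num_items, fpp):
        # Standard sizing: m = -n * ln(p) / (ln 2)^2 bits, k = (m / n) * ln 2 hashes.
        self.num_bits = max(1, math.ceil(-num_items * math.log(fpp) / math.log(2) ** 2))
        self.num_hashes = max(1, round(self.num_bits / num_items * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, value):
        # Double hashing: derive k bit positions from two halves of one digest.
        digest = hashlib.sha256(repr(value).encode()).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, value):
        for pos in self._positions(value):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, value):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(value))


# Hypothetical files and the indexed-column values each one holds.
files = {
    "part-0.parquet": ["alice", "bob"],
    "part-1.parquet": ["carol", "dave"],
}

filters = {}
for name, values in files.items():
    bf = BloomFilter(num_items=1000, fpp=0.01)
    for v in values:
        bf.add(v)
    filters[name] = bf


def files_to_read(value):
    """Keep only the files whose filter cannot rule the value out."""
    return [name for name, bf in filters.items() if bf.might_contain(value)]
```

Lowering fpp or raising num_items makes each filter larger, which trades storage for fewer files read unnecessarily. A file that really contains the value is never skipped, which is why skipping cannot change query results.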