A DIM index materializes the distinct values and per-value row counts of selected Delta table columns. The query engine loads each column into a process-wide Arrow cache.

Which Power BI queries use the DIM index?

It serves single-column distinct-value, group-by count, minimum, maximum, and cardinality probe shapes commonly generated by DirectQuery slicers and filter panels.

Does a DIM index cache the entire fact table?

No. It caches only the distinct values and stored counts for the indexed columns. Measures, joins, and multi-column filters continue through the normal query engine.

What happens after the Delta table changes?

A stale index is not used for a mismatched table snapshot. Queries fall back to the normal engine path until the index is rebuilt.

Serve Power BI Slicers from Memory with a DIM Index

Why Power BI needs this

In DirectQuery mode, slicers and filter panels repeatedly ask for a column's members. Without a DIM index, each request can scan the parent files even though the answer changes infrequently.

SELECT region, COUNT(*)
FROM sales.orders
GROUP BY region;

This common slicer shape can be answered from the DIM cache when only one indexed column is referenced.

Create one index for several slicers

CREATE DIM INDEX idx_pbi_slicers
ON TABLE sales.orders (
    region,
    status,
    sales_channel
)
WITH (cardinality_cap = 100000);

Each indexed column gets an independent cache entry. The cardinality cap protects against accidentally indexing a near-unique column.

Queries served from memory

The fast path covers the single-column query shapes Power BI uses to populate slicers and inspect column ranges:

SELECT DISTINCT region
FROM sales.orders;

SELECT status, COUNT(*)
FROM sales.orders
GROUP BY status;

SELECT MIN(order_date), MAX(order_date)
FROM sales.orders;

After the first load, the cached Arrow batch is reused without reading the parent Parquet files.

What stays on the normal engine path

A DIM index is intentionally narrow. Measures, joins, multi-column groups, and filters that reference another column continue through the regular SQL engine:

SELECT region, SUM(total_amount)
FROM sales.orders
WHERE status = 'complete'
GROUP BY region;

This query references three columns, so it is not answered from the DIM cache.

Confirm the cache is used

EXPLAIN
SELECT DISTINCT region
FROM sales.orders;

A matching query shows a DimCacheServe plan with the table, index, column, query shape, and versions.

Rebuild after table changes

The DIM index is version-bound. If the parent snapshot changes and the index is no longer current, DeltaForge falls back to the normal query path rather than returning stale data.

REBUILD INDEX idx_pbi_slicers
ON TABLE sales.orders;

Schedule a rebuild after data loads to refresh values and restore exact per-value counts.

Choose columns carefully

Good: region, category, status, channel, segment, and other slicer columns.
Avoid: order IDs, transaction IDs, timestamps, and other near-unique columns.
Partition columns may already use a metadata-only fast path without a DIM index.

The result

Power BI still queries the live Delta table through DirectQuery. The repetitive dimension-envelope requests are served from engine memory, while analytical measures continue to use the full query engine against the current table snapshot.