Why Power BI needs this

In DirectQuery mode, slicers and filter panels repeatedly ask for a column's members. Without a DIM index, each request can scan the parent files even though the answer changes infrequently.

SELECT region, COUNT(*)
FROM sales.orders
GROUP BY region;
This common slicer shape can be answered from the DIM cache when only one indexed column is referenced.

Create one index for several slicers

CREATE DIM INDEX idx_pbi_slicers
ON TABLE sales.orders (
    region,
    status,
    sales_channel
)
WITH (cardinality_cap = 100000);
Each indexed column gets an independent cache entry. The cardinality cap protects against accidentally indexing a near-unique column.

Queries served from memory

The fast path covers the single-column query shapes Power BI uses to populate slicers and inspect column ranges:

SELECT DISTINCT region
FROM sales.orders;

SELECT status, COUNT(*)
FROM sales.orders
GROUP BY status;

SELECT MIN(order_date), MAX(order_date)
FROM sales.orders;
After the first load, the cached Arrow batch is reused without reading the parent Parquet files.

What stays on the normal engine path

A DIM index is intentionally narrow. Measures, joins, multi-column groups, and filters that reference another column continue through the regular SQL engine:

SELECT region, SUM(total_amount)
FROM sales.orders
WHERE status = 'complete'
GROUP BY region;
This query references three columns, so it is not answered from the DIM cache.

Confirm the cache is used

EXPLAIN
SELECT DISTINCT region
FROM sales.orders;
A matching query shows a DimCacheServe plan with the table, index, column, query shape, and versions.

Rebuild after table changes

The DIM index is version-bound. If the parent snapshot changes and the index is no longer current, DeltaForge falls back to the normal query path rather than returning stale data.

REBUILD INDEX idx_pbi_slicers
ON TABLE sales.orders;
Schedule a rebuild after data loads to refresh values and restore exact per-value counts.

Choose columns carefully

  • Good: region, category, status, channel, segment, and other slicer columns.
  • Avoid: order IDs, transaction IDs, timestamps, and other near-unique columns.
  • Partition columns may already use a metadata-only fast path without a DIM index.

The result

Power BI still queries the live Delta table through DirectQuery. The repetitive dimension-envelope requests are served from engine memory, while analytical measures continue to use the full query engine against the current table snapshot.