Compute Engine

Distributed Vectorized Execution

Zero-copy Arrow-native compute engine with SIMD acceleration, parallel task scheduling, and memory-efficient streaming. Designed from the ground up for analytical workloads on modern hardware.

SIMD-accelerated batch processing
Zero-copy Arrow columnar format
Parallel multi-core scheduling
[Diagram: columnar batches of (id, price, qty) are vectorized across SIMD execution lanes (e.g. lane 0 computes 29.99 × 3 = 89.97), partitions are fanned out to parallel workers, and per-worker results are merged into a columnar result (id, total).]

Vectorized Execution Model

Process thousands of values per operation using columnar batches

Query Plan
Logical Plan Physical Plan Execution DAG
Task Scheduler
Partition Assignment Pipeline Stages Work Stealing Priority Queue
Execution Runtime
Thread Pool Memory Arena Batch Pipeline Spill Manager
Vectorized Operators
SIMD Kernels Arrow Arrays Null Handling Dictionary Encoding

SIMD-Accelerated Operations

Hand-tuned vectorized kernels for maximum throughput

Comparison Operations

  • AVX-512 8-way parallel comparison
  • Null-aware comparison semantics
  • String comparison with SSE4.2
  • Vectorized LIKE pattern matching
  • IN list checking with Bloom filters
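Hardware details aside, the contract of a vectorized, null-aware comparison can be sketched in a few lines. This is an illustrative model, not the engine's kernel: it evaluates a whole column batch in one call and returns a selection vector of passing row indices, with `None` standing in for SQL NULL (which never passes a comparison).

```python
def vectorized_lt(values, threshold):
    """Compare an entire column batch against a scalar, returning a
    selection vector of passing row indices. None models a SQL NULL,
    which never satisfies a comparison (null-aware semantics)."""
    return [i for i, v in enumerate(values)
            if v is not None and v < threshold]

batch = [29.99, 14.50, None, 7.25, 45.00]
sel = vectorized_lt(batch, 20.0)  # rows 1 and 3 pass
```

A SIMD kernel performs the same per-row comparisons eight or sixteen lanes at a time and emits the selection vector as a bitmask, but the input/output contract is the same.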

Arithmetic Operations

  • Fused multiply-add (FMA)
  • Overflow-checked arithmetic
  • Decimal multiplication with carry
  • Vectorized division
  • Modular arithmetic

Aggregation

  • Horizontal SIMD sum/min/max
  • Parallel hash aggregation
  • Two-pass variance calculation
  • Approximate distinct count (HLL)
  • Vectorized group-by
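The two-pass variance listed above is a standard numerically stable formulation: one pass computes the mean, a second pass sums squared deviations from it. A minimal scalar sketch (the vectorized kernel applies the same algebra lane-wise):

```python
def two_pass_variance(xs):
    """Population variance in two passes: pass 1 computes the mean,
    pass 2 accumulates squared deviations from that mean. More stable
    than the naive one-pass sum-of-squares formula."""
    n = len(xs)
    mean = sum(xs) / n                       # pass 1
    ss = sum((x - mean) ** 2 for x in xs)    # pass 2
    return ss / n

two_pass_variance([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])  # 4.0
```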

String Operations

  • SIMD string search (memmem)
  • Vectorized UTF-8 validation
  • Parallel string concatenation
  • SSE4.2 substring search
  • Batch regex matching

8× faster with AVX-512 vs. scalar
4.2 GB/s filter throughput per core
12M rows/sec aggregation

Memory Management

Precise memory accounting with graceful spilling

Memory Pools

  • Per-query memory limits
  • Hierarchical memory accounting
  • Reservation and tracking
  • Memory pressure callbacks
  • Pool isolation between queries
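The reservation-and-tracking model above reduces to a simple contract: a query reserves bytes against a hard limit before allocating, and a failed reservation is the signal to spill rather than crash. A minimal sketch (class and error-handling choices here are illustrative, not the engine's API):

```python
class MemoryPool:
    """Per-query memory pool: reservations are tracked against a hard
    limit, and exceeding the limit raises instead of over-allocating."""

    def __init__(self, limit_bytes):
        self.limit = limit_bytes
        self.reserved = 0

    def reserve(self, nbytes):
        if self.reserved + nbytes > self.limit:
            raise MemoryError("pool limit exceeded")  # memory pressure signal
        self.reserved += nbytes

    def release(self, nbytes):
        self.reserved -= nbytes

pool = MemoryPool(1024)       # 1 KiB query budget
pool.reserve(800)             # ok
try:
    pool.reserve(512)         # would exceed the limit
except MemoryError:
    pool.release(800)         # operator spills, then frees its reservation
```

Hierarchical accounting nests these pools (operator pools inside a query pool inside a global pool), so a reservation propagates up until some level either admits or rejects it.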

Buffer Management

  • Zero-copy buffer sharing
  • Reference-counted buffers
  • Aligned allocation (64-byte)
  • Large page support (hugepages)
  • Buffer pool recycling

Spill-to-Disk

  • Graceful memory overflow handling
  • Sort spill with external merge
  • Hash table partitioned spill
  • Async spill writes
  • Compressed spill format
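Sort spill with external merge works by sorting memory-budget-sized runs, writing each run to disk, and then streaming a k-way merge of the runs. A toy sketch of that shape (using pickle and temp files purely for illustration; the real spill format is compressed and columnar):

```python
import heapq
import pickle
import tempfile

def external_sort(rows, budget=4):
    """Sort more rows than fit in the memory budget: sort fixed-size
    runs, spill each run to a temp file, then k-way merge the runs."""
    runs = []
    for i in range(0, len(rows), budget):
        spill = tempfile.TemporaryFile()            # one spill file per run
        pickle.dump(sorted(rows[i:i + budget]), spill)
        spill.seek(0)
        runs.append(iter(pickle.load(spill)))       # stream the run back
    return list(heapq.merge(*runs))                 # k-way heap merge

external_sort([5, 3, 8, 1, 9, 2, 7, 4], budget=3)  # [1, 2, 3, 4, 5, 7, 8, 9]
```

Only one batch per run needs to be resident during the merge, which is what keeps peak memory bounded by the budget rather than by the input size.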

Cache Management

  • LRU page cache
  • Metadata cache
  • Statistics cache
  • Compiled expression cache
  • Schema cache

Parallel Execution Framework

Scale linearly across all available CPU cores

Task Scheduling

  • Adaptive task scheduler
  • Hardware-aware task placement
  • Priority-based scheduling
  • Cooperative multitasking
  • Yield points for cancellation
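Cooperative multitasking with priority scheduling and yield points can be modeled with generators: each task runs until its next yield point (where cancellation can be checked), then returns to a priority queue. A simplified sketch, with task names and step counts invented for the example:

```python
import heapq
from itertools import count

def run(tasks):
    """Cooperative priority scheduler: tasks are generators that yield
    at cancellation-safe points; a lower priority value runs first."""
    tie = count()  # FIFO tie-break among equal priorities
    ready = [(prio, next(tie), task) for prio, task in tasks]
    heapq.heapify(ready)
    trace = []
    while ready:
        prio, _, task = heapq.heappop(ready)
        try:
            trace.append(next(task))                  # run to next yield point
            heapq.heappush(ready, (prio, next(tie), task))  # requeue
        except StopIteration:
            pass                                      # task finished
    return trace

def work(name, steps):
    for i in range(steps):
        yield f"{name}:{i}"

run([(0, work("hi", 2)), (1, work("lo", 2))])
# high-priority task completes before the low-priority one starts
```

Because tasks only lose the core at yield points, cancellation and priority changes take effect at well-defined boundaries instead of interrupting a batch mid-flight.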

Partition Parallelism

  • Automatic partition detection
  • Dynamic partition splitting
  • Load balancing across workers
  • Partition coalescing
  • Skew handling

Pipeline Parallelism

  • Morsel-driven parallelism
  • Exchange operators
  • Parallel hash build
  • Parallel sort
  • Concurrent aggregation
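Morsel-driven parallelism means workers repeatedly pull small row ranges ("morsels") from a shared queue instead of being assigned fixed halves of the input, so a fast core naturally takes more morsels and load stays balanced. A minimal thread-pool sketch of that pull loop (function and parameter names are illustrative):

```python
import queue
from concurrent.futures import ThreadPoolExecutor

def parallel_sum(data, workers=4, morsel=1024):
    """Sum a column morsel-by-morsel: the shared queue hands out small
    row ranges, and each worker loops until the queue is drained."""
    tasks = queue.Queue()
    for lo in range(0, len(data), morsel):
        tasks.put((lo, min(lo + morsel, len(data))))

    def worker():
        total = 0
        while True:
            try:
                lo, hi = tasks.get_nowait()   # pull the next morsel
            except queue.Empty:
                return total                  # queue drained: worker done
            total += sum(data[lo:hi])

    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(worker) for _ in range(workers)]
        return sum(f.result() for f in futures)

parallel_sum(list(range(10_000)))  # 49995000
```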

I/O Parallelism

  • Async I/O with kernel-level acceleration
  • Prefetch scheduling
  • Parallel file reads
  • Batched network I/O
  • Streaming result delivery

Seven Histogram Algorithms

Workload-matched cardinality estimation for query optimization

📊

Equi-Width

Fixed-width buckets for uniform distributions. Fast construction, simple storage.

Best for: Uniformly distributed numeric data
📈

Equi-Depth

Equal-frequency buckets adapting to data skew. Standard choice for most workloads.

Best for: General-purpose, skewed distributions
🎯

Singleton

Individual buckets for high-frequency values. Perfect for low-cardinality columns.

Best for: Categorical data, enum columns
🔀

Hybrid

Singletons for frequent values, equi-depth for the rest. Handles mixed distributions.

Best for: Real-world data with hot values
📉

Compressed

Run-length encoded for repeated sequences. Memory-efficient storage.

Best for: Data with long runs of equal values

Streaming

Online construction without full data pass. Count-Min Sketch based.

Best for: Large datasets, streaming updates
🧮

Wavelet

Multi-resolution representation using wavelet transform. Excellent range query estimation.

Best for: Range queries, time-series data
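To make the equi-depth idea concrete: bucket boundaries are chosen so each bucket holds roughly the same number of rows, which is why the histogram adapts to skew. A minimal construction sketch (the boundary-picking rule here is one simple choice among several):

```python
def equi_depth_bounds(values, buckets):
    """Pick equi-depth bucket boundaries: sort the column, then take
    the value at every n/buckets-th position, so each bucket covers
    roughly the same number of rows regardless of skew."""
    xs = sorted(values)
    n = len(xs)
    return [xs[b * n // buckets] for b in range(1, buckets)]

# Uniform data gives evenly spaced bounds; skewed data would not.
equi_depth_bounds(list(range(100)), 4)  # [25, 50, 75]
```

A value `v` falls into the bucket numbered by how many bounds are ≤ `v`, and the optimizer estimates a predicate's selectivity from the fraction of buckets (and partial buckets) it covers.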

Physical Operators

Comprehensive operator library for any query pattern

Scan Operators

  • Columnar file scan with projection and filter pushdown
  • Delta Lake scan with time travel and deletion vectors
  • In-memory and inline data scans

Join Operators

  • Hash join with parallel build/probe and spill support
  • Sort-merge join for pre-sorted inputs
  • Nested loop join for non-equi joins
  • Optimized semi/anti joins for EXISTS and IN queries
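The hash join's two phases are worth spelling out: build a hash table on one side (ideally the smaller), then stream the other side against it. A single-threaded sketch with dict rows standing in for columnar batches; the engine's version builds and probes in parallel and can spill partitions:

```python
from collections import defaultdict

def hash_join(build_rows, probe_rows, build_key, probe_key):
    """Inner hash join: hash the build side once, then probe each
    streamed row against the table and emit merged matches."""
    table = defaultdict(list)
    for row in build_rows:                      # build phase
        table[row[build_key]].append(row)
    out = []
    for row in probe_rows:                      # probe phase
        for match in table.get(row[probe_key], ()):
            out.append({**match, **row})
    return out

customers = [{"cust": "a", "name": "Ann"}, {"cust": "b", "name": "Bob"}]
orders = [{"cust": "a", "amt": 5}, {"cust": "c", "amt": 7}]
hash_join(customers, orders, "cust", "cust")
# one match: Ann's customer row joined with her order
```

Semi/anti joins follow the same build/probe shape but emit (or suppress) the probe row on first match instead of producing merged output, which is why EXISTS and IN queries get a cheaper specialized operator.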

Aggregation Operators

  • Parallel hash-based grouping
  • Pre-sorted aggregation
  • Window function evaluation
  • Single-group aggregation

Sort & Limit

  • External merge sort with spill-to-disk
  • Heap-based top-K selection
  • Row count limiting and offset
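Heap-based top-K avoids sorting the whole input for an `ORDER BY ... LIMIT K`: a size-K min-heap keeps only the current best K rows, so memory stays O(K) however many rows stream through. A compact sketch:

```python
import heapq

def top_k(values, k):
    """Largest k values via a size-k min-heap: a new value only enters
    if it beats the smallest of the current top k."""
    heap = []
    for v in values:
        if len(heap) < k:
            heapq.heappush(heap, v)
        elif v > heap[0]:
            heapq.heapreplace(heap, v)   # evict current minimum
    return sorted(heap, reverse=True)

top_k([5, 1, 9, 3, 7, 2, 8], 3)  # [9, 8, 7]
```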

Set Operators

  • Union and union all
  • Intersect and except

Exchange Operators

  • Hash and range repartitioning
  • Partition coalescing and broadcast

Expression Evaluation Engine

JIT-style compiled expressions for maximum performance

Expression Compilation

  • Type-specialized evaluation
  • Null handling optimization
  • Constant folding
  • Common subexpression elimination
  • Short-circuit evaluation
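Constant folding means subexpressions with no column references are evaluated once at plan time instead of once per row, so `price * (2 + 3)` is compiled as `price * 5`. A small illustration using Python's `ast` module as a stand-in expression tree (the engine's IR differs, but the bottom-up rewrite is the same idea):

```python
import ast
import operator

# Fold only operators we explicitly understand.
OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}

class FoldConstants(ast.NodeTransformer):
    """Rewrite constant binary subtrees into a single constant node."""
    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold children first (bottom-up)
        if (isinstance(node.left, ast.Constant)
                and isinstance(node.right, ast.Constant)
                and type(node.op) in OPS):
            value = OPS[type(node.op)](node.left.value, node.right.value)
            return ast.copy_location(ast.Constant(value), node)
        return node

def fold(expr):
    return ast.unparse(FoldConstants().visit(ast.parse(expr, mode="eval")))

fold("price * (2 + 3)")  # 'price * 5'
```

Common subexpression elimination is the complementary rewrite: where folding collapses constant subtrees, CSE computes a repeated non-constant subtree once and reuses the result.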

Batch Evaluation

  • Evaluate entire column at once
  • Amortize function call overhead
  • SIMD-friendly memory layout
  • Null bitmap propagation
  • Selection vector support

Type Coercion

  • Implicit type promotion
  • Explicit CAST operations
  • Safe cast with null on error
  • TRY_CAST semantics
  • Format string parsing
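TRY_CAST semantics, and "safe cast with null on error", mean a row that fails to convert becomes NULL instead of failing the whole query. A batch-shaped sketch with `None` modeling NULL:

```python
def try_cast(values, target):
    """TRY_CAST over a column batch: unconvertible rows become None
    (NULL) rather than raising; NULL inputs stay NULL."""
    out = []
    for v in values:
        if v is None:
            out.append(None)            # NULL in, NULL out
            continue
        try:
            out.append(target(v))
        except (TypeError, ValueError):
            out.append(None)            # conversion failure -> NULL
    return out

try_cast(["1", "2.5x", None, "3"], float)  # [1.0, None, None, 3.0]
```

A plain CAST uses the same conversion path but surfaces the error instead of substituting NULL.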

UDF Integration

  • Scalar UDF evaluation
  • Aggregate UDF support
  • Window UDF support
  • Table-valued functions
  • Async UDF execution

Transparent Compute Metering

Pay only for actual query execution. No clusters, no idle costs.

DFCU - Delta Forge Compute Unit

  • Transparent formula: DFCU = (wall_clock_s × cores) / 3600
  • 1 DFCU = 1 core-hour of actual compute
  • Billed per query, no cluster spin-up or idle time
  • No minimum runtime windows or shutdown delays
  • Auditable and predictable cost model
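The DFCU formula above is simple enough to check by hand, which is the point of calling it auditable. For example, a 90-second query running on 8 cores consumes 720 core-seconds, or 0.2 DFCU:

```python
def dfcu(wall_clock_s, cores):
    """DFCU = (wall_clock_s × cores) / 3600, i.e. core-hours of
    actual compute, per the published formula."""
    return wall_clock_s * cores / 3600

dfcu(90, 8)    # 720 core-seconds / 3600 = 0.2 DFCU
dfcu(3600, 1)  # one core for one hour = exactly 1 DFCU
```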

Real-Time Usage Dashboard

  • Total DFCU consumed and estimated cost
  • Active compute nodes and core count
  • Queries executed and data processed (read + written)
  • Daily trend charts for capacity planning
  • Filter by user, pipeline, schedule, or node

0s cluster spin-up time
0s idle time billed
100% cost visibility

See Pricing & DFCU Details

Harness the full power of modern hardware

SIMD-accelerated, zero-copy, production-ready compute engine.