Compute that scales with demand

DeltaForge adds capacity when queries arrive and releases it when they stop. Scale all the way to zero when idle, or keep a warm baseline for instant response. Either way, you pay only for the compute that runs.

Adds and removes capacity automatically
Scale to zero when idle, warm baseline when you want it
Runs in your own cloud, no clusters to babysit
[Diagram: query demand feeds the autoscaler, which watches load and sets the capacity of the worker pool. Workers are shown running, starting, and draining as the pool scales up on demand and down when idle, from a minimum of 0 to a maximum of N.]

How autoscaling works

From idle to peak and back, capacity follows the work in real time

Watches demand

The load signal is in-flight queries, not open sessions, so an idle BI connection never holds capacity. DeltaForge reads what is actually running on each worker and reevaluates within seconds. Nothing depends on you watching a dashboard.
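
As a rough illustration of that signal, the sketch below sums in-flight queries across workers and ignores open sessions. The `WorkerSnapshot` type and its field names are assumptions made for this example, not DeltaForge's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkerSnapshot:
    """Hypothetical per-worker reading; field names are illustrative."""
    open_sessions: int
    in_flight_queries: int


def demand_signal(workers: list[WorkerSnapshot]) -> int:
    """Total in-flight queries; open sessions are deliberately ignored."""
    return sum(w.in_flight_queries for w in workers)


# An idle BI connection (one open session, nothing running) adds no demand.
pool = [
    WorkerSnapshot(open_sessions=1, in_flight_queries=0),
    WorkerSnapshot(open_sessions=2, in_flight_queries=3),
]
print(demand_signal(pool))  # 3
```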

Adds capacity on demand

When load rises, or a query arrives with nowhere to run, fresh workers start automatically up to the ceiling you set. New capacity joins the pool and starts taking work as soon as it is ready.

Queries wait only for startup

A query that arrives at a full or empty pool is held for the short moment it takes a worker to come online, then runs. No failed request, no manual retry, no babysitting.

Releases idle capacity

When demand drops, extra workers finish their in-flight queries and shut down cleanly, down to the floor you set. In-flight work is never interrupted.
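
Taken together, the steps above amount to choosing a worker count that covers current demand and stays between your floor and ceiling. The sketch below shows one way to express that. The `queries_per_worker` target and all function and parameter names are assumptions for illustration, since this page does not describe DeltaForge's internal formula.

```python
import math


def target_workers(in_flight: int, waiting: int, min_workers: int,
                   max_workers: int, queries_per_worker: int = 4) -> int:
    """Worker count that covers demand, clamped to [min_workers, max_workers].

    queries_per_worker is a hypothetical per-worker concurrency target.
    """
    demand = in_flight + waiting
    needed = math.ceil(demand / queries_per_worker)
    return max(min_workers, min(max_workers, needed))


print(target_workers(0, 0, min_workers=0, max_workers=8))    # 0: idle, scale to zero
print(target_workers(0, 1, min_workers=0, max_workers=8))    # 1: a query is waiting
print(target_workers(9, 0, min_workers=1, max_workers=8))    # 3: ceil(9 / 4)
print(target_workers(100, 0, min_workers=1, max_workers=8))  # 8: capped at the ceiling
```

When the target falls below the current count, the surplus workers would be marked as draining so they finish their in-flight queries before shutting down.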

Cost and performance, under your control

No idle overhead, no surprises, full visibility

Scale to zero

Set the floor to zero and DeltaForge releases all workers during quiet periods, so you pay nothing for compute while nothing is running.

Warm baseline

Keep one or more workers always ready so the first query of the day runs without waiting for a worker to start. You choose the balance between cost when idle and latency on the first request.

Your limits, your policy

Define the minimum to keep warm, the maximum ceiling that caps cost, and how eagerly capacity is added. Changes take effect live, without a restart.
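
To make the three settings concrete, here is a minimal sketch of a policy object. The names `min_workers`, `max_workers`, and `scale_up_step` are hypothetical, and modeling eagerness as a per-evaluation step size is an assumption of this example, not DeltaForge's documented configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingPolicy:
    """Hypothetical policy; not DeltaForge's real configuration schema."""
    min_workers: int    # floor kept warm (0 allows scale to zero)
    max_workers: int    # ceiling that caps cost
    scale_up_step: int  # assumed "eagerness": most workers added per evaluation

    def __post_init__(self) -> None:
        if self.min_workers < 0:
            raise ValueError("min_workers must be >= 0")
        if self.max_workers < max(1, self.min_workers):
            raise ValueError("max_workers must be >= 1 and >= min_workers")
        if self.scale_up_step < 1:
            raise ValueError("scale_up_step must be >= 1")


def next_capacity(current: int, wanted: int, policy: ScalingPolicy) -> int:
    """Move from current toward wanted, respecting the policy limits."""
    wanted = max(policy.min_workers, min(policy.max_workers, wanted))
    if wanted > current:
        # Scale up by at most scale_up_step workers per evaluation.
        return min(wanted, current + policy.scale_up_step)
    return wanted


cautious = ScalingPolicy(min_workers=0, max_workers=10, scale_up_step=1)
eager = ScalingPolicy(min_workers=1, max_workers=10, scale_up_step=5)
print(next_capacity(0, 6, cautious))  # 1
print(next_capacity(1, 6, eager))     # 6
```

With the same demand, the cautious policy adds one worker per evaluation while the eager one reaches the wanted count in a single step, which is the cost-versus-responsiveness choice the setting is meant to expose.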

Full visibility

See current versus target capacity, how much demand pressure there is, and every scale action with its reason, right from the console.

Runs in your cloud

Capacity follows the work on the infrastructure you already run

Your infrastructure

Autoscaling drives compute in your own cloud account. DeltaForge stays in charge of when and how much to scale; your data never leaves your environment.

Works with container platforms

Workers are stateless and quick to start, so they fit Kubernetes and the Azure, AWS, or GCP container services you already run. The same scaling policy applies wherever your workers live.

No clusters to manage

There are no clusters to size or warm up. Capacity is created and removed as the work demands it, so there is nothing to provision ahead of time.

Off until you want it

Autoscaling is opt-in. Start in a safe observe mode to see exactly what it would do against real traffic, then turn it on when you are ready.
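
One way to picture observe mode is a decision step that always records what it would do but applies the change only when autoscaling is active. The sketch below is an assumption about how such a switch could look; the `Mode` values, the `Decision` record, and the `decide` function are hypothetical names, not DeltaForge's API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Mode(Enum):
    OFF = "off"          # autoscaling disabled (the opt-in default)
    OBSERVE = "observe"  # record decisions, change nothing
    ACTIVE = "active"    # record decisions and apply them


@dataclass(frozen=True)
class Decision:
    current: int
    target: int
    reason: str
    applied: bool


def decide(mode: Mode, current: int, target: int, reason: str) -> Optional[Decision]:
    """Return the scale decision for this evaluation, or None when off."""
    if mode is Mode.OFF:
        return None
    # Only an active autoscaler with a real change applies anything.
    applied = mode is Mode.ACTIVE and target != current
    return Decision(current, target, reason, applied)


d = decide(Mode.OBSERVE, current=1, target=3, reason="queries waiting")
print(d.target, d.applied)  # 3 False
```

Because each decision carries its reason, the same record could back both the observe-mode preview and a log of scale actions.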

Frequently Asked Questions

Quick answers on scaling lakehouse compute on demand

How does a lakehouse autoscale compute?

DeltaForge measures in-flight queries, not open sessions, so an idle BI connection never holds capacity. When queries arrive with nowhere to run, stateless workers start automatically up to the ceiling you set; when work stops, workers finish their in-flight queries and shut down cleanly, down to zero if you allow it.

Can compute scale to zero?

Yes. Set the minimum to zero and DeltaForge releases all workers during quiet periods, so you pay nothing for compute while idle. The first query afterward waits briefly for a worker to come online, then runs.

Will the first query after an idle period be slow?

Only by the short time it takes a worker to start. If you prefer no startup delay on the common path, keep a warm baseline of one or more workers always ready; you choose the cost-versus-latency tradeoff.

Do I have to manage nodes or clusters?

No. There are no clusters to size or warm up. Autoscaling is opt-in and can run in a safe observe mode first, so you can see what it would do before turning it on. Each worker runs the same Rust compute engine, coordinated by the control plane.

Further reading

Hands-on guides for the workloads your workers will be running

OPTIMIZE, VACUUM and Z-ORDER Without Spark: The Delta Maintenance Runbook

Maintenance jobs are exactly the kind of bursty work autoscaling absorbs: capacity rises for the run, then drains back down.

MERGE, UPDATE and DELETE on Delta Lake Without Spark

Run full DML on Delta tables from plain SQL, on stateless workers instead of a long-lived cluster.

Pay for compute that works, not compute that waits

Let capacity follow demand, from zero to peak and back, on your own cloud.