What is compute autoscaling in DeltaForge?

DeltaForge watches query demand and how busy each worker is, then adds or removes compute workers automatically to match. Capacity rises when load increases and falls when work stops, with no manual node management.

Where does autoscaling run?

It drives compute in your own cloud account, on the platform you already run DeltaForge on. It works with Kubernetes and with the Azure, AWS, and GCP container services.

Lakehouse Compute Autoscaling: Stateless Workers

How autoscaling works

From idle to peak and back, capacity follows the work in real time

Watches demand

The load signal is in-flight queries, not open sessions, so an idle BI connection never holds capacity. DeltaForge reads what is actually running on each worker and reevaluates within seconds. Nothing depends on you watching a dashboard.
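As a rough illustration of that load signal, the sketch below counts running queries and ignores open sessions. The `Worker` type, its fields, and the `demand` function are hypothetical names chosen for this example, not DeltaForge's actual API.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    # Hypothetical per-worker snapshot; field names are assumptions.
    name: str
    in_flight_queries: int  # queries executing right now
    open_sessions: int      # connected clients, possibly idle


def demand(workers):
    """Load signal: sum of running queries. Open sessions are deliberately ignored."""
    return sum(w.in_flight_queries for w in workers)


pool = [
    Worker("w1", in_flight_queries=3, open_sessions=10),
    Worker("w2", in_flight_queries=0, open_sessions=25),  # idle BI connections only
]
print(demand(pool))  # 3: the 35 open sessions add nothing to the signal
```

In this sketch the second worker contributes no demand even though it holds 25 sessions, which is the behavior the paragraph above describes.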

Adds capacity on demand

When load rises, or a query arrives with nowhere to run, fresh workers start automatically up to the ceiling you set. New capacity joins the pool and starts taking work as soon as it is ready.

Queries wait only for startup

A query that arrives at a full or empty pool is held for the short moment it takes a worker to come online, then runs. No failed request, no manual retry, no babysitting.
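One way to picture the hold-then-run behavior is a pending queue that is flushed when a worker reports ready. This is a minimal simulation under assumed names (`Pool`, `submit`, `worker_ready`, and a fixed number of slots per worker); it is not how DeltaForge is necessarily implemented.

```python
from collections import deque


class Pool:
    """Hypothetical pool: queries with nowhere to run are held, never rejected."""

    def __init__(self):
        self.free_slots = 0     # capacity available right now
        self.pending = deque()  # queries waiting for a worker to start
        self.running = []       # queries that have been dispatched

    def submit(self, query):
        if self.free_slots > 0:
            self.free_slots -= 1
            self.running.append(query)
        else:
            self.pending.append(query)  # held until capacity arrives

    def worker_ready(self, slots):
        """Called when a new worker comes online with `slots` of capacity."""
        self.free_slots += slots
        while self.pending and self.free_slots > 0:
            self.free_slots -= 1
            self.running.append(self.pending.popleft())  # oldest query first


pool = Pool()
pool.submit("q1")            # empty pool: the query is held
pool.worker_ready(slots=2)   # a worker starts: the held query runs
```

The caller never sees an error in this sketch; the only cost of arriving at an empty pool is the wait until `worker_ready` fires.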

Releases idle capacity

When demand drops, extra workers finish their in-flight queries and shut down cleanly, down to the floor you set. In-flight work is never interrupted.
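The clean shutdown described above can be sketched as a two-step drain: surplus workers are first marked as draining, and a draining worker is removed only once it has no in-flight queries. The names `Worker`, `mark_for_removal`, and `reap`, and the choice to drain the least busy workers first, are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    # Hypothetical worker record; field names are assumptions.
    name: str
    in_flight: int = 0
    draining: bool = False


def mark_for_removal(workers, keep):
    """Mark surplus workers as draining, least busy first, leaving `keep` active."""
    active = [w for w in workers if not w.draining]
    surplus = max(0, len(active) - keep)
    for w in sorted(active, key=lambda w: w.in_flight)[:surplus]:
        w.draining = True  # takes no new work, but finishes what it has


def reap(workers):
    """Drop draining workers only once their in-flight count reaches zero."""
    return [w for w in workers if not (w.draining and w.in_flight == 0)]


workers = [Worker("a", in_flight=2), Worker("b"), Worker("c", in_flight=1)]
mark_for_removal(workers, keep=1)  # "b" and "c" are marked as draining
workers = reap(workers)            # "b" is idle and goes; "c" is still busy and stays
```

Because removal is gated on `in_flight == 0`, worker "c" in this example keeps running until its last query completes.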

Cost and performance, under your control

No idle overhead, no surprises, full visibility

Scale to zero

Set the floor to zero and DeltaForge releases all workers during quiet periods, so you pay nothing for compute while nothing is running.

Warm baseline

Keep one or more workers always ready so the first query of the day is instant. You choose the balance between cost when idle and latency on the first request.

Your limits, your policy

Define the minimum to keep warm, the maximum ceiling that caps cost, and how eagerly capacity is added. Changes take effect live, without a restart.
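To make the three knobs concrete, here is a hedged sketch of a policy object and how it could bound a scaling step. `ScalingPolicy`, its field names, and the interpretation of "eagerness" as a maximum number of workers added per evaluation are all assumptions; the real settings may be named and shaped differently.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ScalingPolicy:
    # Hypothetical policy fields; names and defaults are assumptions.
    min_workers: int = 0   # floor kept warm (0 allows scale to zero)
    max_workers: int = 8   # ceiling that caps cost
    max_step_up: int = 2   # "eagerness": workers added per evaluation

    def __post_init__(self):
        if self.min_workers < 0:
            raise ValueError("min_workers must be >= 0")
        if self.max_workers < max(1, self.min_workers):
            raise ValueError("max_workers must be >= 1 and >= min_workers")
        if self.max_step_up < 1:
            raise ValueError("max_step_up must be >= 1")


def next_capacity(current, desired, policy):
    """Clamp the desired worker count to the policy and limit growth per step."""
    bounded = max(policy.min_workers, min(policy.max_workers, desired))
    if bounded > current:
        return min(bounded, current + policy.max_step_up)
    return bounded


policy = ScalingPolicy(min_workers=0, max_workers=8, max_step_up=2)
print(next_capacity(1, 6, policy))  # 3: growth is limited to two workers per step
print(next_capacity(5, 0, policy))  # 0: may fall all the way to the floor

warm = replace(policy, min_workers=1)  # a live policy change is just a new value
print(next_capacity(5, 0, warm))       # 1: the warm baseline is kept
```

Treating the policy as an immutable value that is swapped out whole is one simple way to model changes that apply without a restart.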

Full visibility

See current versus target capacity, how much demand pressure there is, and every scale action with its reason, right from the console.

Runs in your cloud

Capacity follows the work on the infrastructure you already run

Your infrastructure

Autoscaling drives compute in your own cloud account. DeltaForge stays in charge of when and how much to scale; your data never leaves your environment.

Works with container platforms

Workers are stateless and quick to start, so they fit Kubernetes and the Azure, AWS, or GCP container services you already run. The same scaling policy applies wherever your workers live.

No clusters to manage

There are no clusters to size or warm up. Capacity is created and removed as the work demands it, so there is nothing to provision ahead of time.

Off until you want it

Autoscaling is opt-in. Start in a safe observe mode to see exactly what it would do against real traffic, then turn it on when you are ready.
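Observe mode can be thought of as running the same decision logic while skipping the side effect. The sketch below records what would have happened and why; `Decision`, `evaluate`, and the `apply_scale` callback are hypothetical names, not part of a documented DeltaForge interface.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    # Hypothetical record of one scaling evaluation.
    current: int
    target: int
    reason: str
    applied: bool


def evaluate(current, target, reason, observe_only, apply_scale):
    """Record every decision; call apply_scale only when not in observe mode."""
    applied = False
    if target != current and not observe_only:
        apply_scale(target)
        applied = True
    return Decision(current, target, reason, applied)


calls = []  # stands in for the real scaling action
dry = evaluate(2, 4, "3 queries queued", observe_only=True, apply_scale=calls.append)
live = evaluate(2, 4, "3 queries queued", observe_only=False, apply_scale=calls.append)
```

Both calls produce the same target and reason; only the second one actually changes capacity, which is what makes the dry run a faithful preview.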

Frequently Asked Questions

Quick answers on scaling lakehouse compute on demand

How does a lakehouse autoscale compute?

DeltaForge measures in-flight queries, not open sessions, so an idle BI connection never holds capacity. When queries arrive with nowhere to run, stateless workers start automatically up to the ceiling you set; when work stops, workers finish their in-flight queries and shut down cleanly, down to zero if you allow it.

Can compute scale to zero?

Yes. Set the minimum to zero and DeltaForge releases all workers during quiet periods, so you pay nothing for compute while idle. The first query afterward waits briefly for a worker to come online, then runs.

Will the first query after an idle period be slow?

Only by the short time it takes a worker to start. If you prefer no startup delay on the common path, keep a warm baseline of one or more workers always ready; you choose the cost-versus-latency tradeoff.

Do I have to manage nodes or clusters?

No. There are no clusters to size or warm up. Autoscaling is opt-in and can run in a safe observe mode first, so you can see what it would do before turning it on. Each worker runs the same Rust compute engine, coordinated by the control plane.

Compute that scales with demand