DeltaForge adds capacity when queries arrive and releases it when they stop. Scale all the way to zero when idle or keep a warm baseline for instant response, and pay only for the compute that runs.
From idle to peak and back, capacity follows the work in real time
The load signal is in-flight queries, not open sessions, so an idle BI connection never holds capacity. DeltaForge reads what is actually running on each worker and reevaluates within seconds. Nothing depends on you watching a dashboard.
When load rises, or a query arrives with nowhere to run, fresh workers start automatically up to the ceiling you set. New capacity joins the pool and starts taking work as soon as it is ready.
A query that arrives at a full or empty pool is held for the short moment it takes a worker to come online, then runs. No failed request, no manual retry, no babysitting.
When demand drops, extra workers finish their in-flight queries and shut down cleanly, down to the floor you set. In-flight work is never interrupted.
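As a rough illustration of the behavior described above, the sketch below derives a target worker count from in-flight and waiting queries and clamps it between the floor and the ceiling. It is not DeltaForge's actual algorithm: the names `ScalingPolicy`, `target_workers`, and the `queries_per_worker` sizing figure are hypothetical and exist only for this example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingPolicy:
    floor: int               # workers always kept warm; 0 allows scale-to-zero
    ceiling: int             # hard cap on workers, which also caps cost
    queries_per_worker: int  # hypothetical sizing assumption, must be > 0


def target_workers(in_flight: int, waiting: int, policy: ScalingPolicy) -> int:
    """Return the worker count that would cover current demand.

    Demand is counted in queries (running plus waiting), not open sessions,
    so an idle connection contributes nothing.
    """
    demand = in_flight + waiting
    needed = math.ceil(demand / policy.queries_per_worker)
    # Never drop below the floor, never exceed the ceiling.
    return max(policy.floor, min(policy.ceiling, needed))


policy = ScalingPolicy(floor=0, ceiling=8, queries_per_worker=4)
print(target_workers(in_flight=0, waiting=0, policy=policy))    # idle pool
print(target_workers(in_flight=0, waiting=1, policy=policy))    # first query after idle
print(target_workers(in_flight=30, waiting=10, policy=policy))  # demand above the ceiling
```

In this sketch an idle pool with a floor of zero targets no workers, a single waiting query targets one, and demand beyond the ceiling is capped rather than exceeded.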
No idle overhead, no surprises, full visibility
Set the floor to zero and DeltaForge releases all workers during quiet periods, so you pay nothing for compute while nothing is running.
Keep one or more workers always ready so the first query of the day is instant. You choose the balance between cost when idle and latency on the first request.
Define the minimum to keep warm, the ceiling that caps cost, and how eagerly capacity is added. Changes take effect live, without a restart.
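One way to picture these three settings is as a small policy object. The sketch below is an assumption-laden illustration, not DeltaForge's configuration format: `Policy`, `min_workers`, `max_workers`, `eagerness`, and `next_capacity` are hypothetical names, and modeling eagerness as the fraction of the capacity gap closed per evaluation is a choice made only for this example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    min_workers: int  # floor kept warm
    max_workers: int  # ceiling that caps cost
    eagerness: float  # hypothetical knob in (0, 1]: share of the gap added per evaluation

    def __post_init__(self) -> None:
        # Reject settings that could never be satisfied.
        if self.min_workers < 0:
            raise ValueError("min_workers must be >= 0")
        if self.max_workers < max(1, self.min_workers):
            raise ValueError("max_workers must be >= 1 and >= min_workers")
        if not 0 < self.eagerness <= 1:
            raise ValueError("eagerness must be in (0, 1]")


def next_capacity(current: int, desired: int, policy: Policy) -> int:
    """Return the capacity to aim for after one evaluation."""
    desired = max(policy.min_workers, min(policy.max_workers, desired))
    if desired <= current:
        # Scale-down goes straight to the clamped target in this sketch;
        # draining of in-flight queries is outside its scope.
        return desired
    gap = desired - current
    # Always add at least one worker when more capacity is wanted.
    return current + max(1, math.ceil(gap * policy.eagerness))


cautious = Policy(min_workers=0, max_workers=8, eagerness=0.5)
print(next_capacity(current=0, desired=10, policy=cautious))
```

With an eagerness of 1.0 the same call would jump directly to the ceiling; lower values approach it over several evaluations.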
See current versus target capacity, how much demand pressure there is, and every scale action with its reason, right from the console.
Capacity follows the work on the infrastructure you already run
Autoscaling drives compute in your own cloud account. DeltaForge stays in charge of when and how much to scale; your data never leaves your environment.
Workers are stateless and quick to start, so they fit Kubernetes and the Azure, AWS, or GCP container services you already run. The same scaling policy applies wherever your workers live.
There are no clusters to size or warm up. Capacity is created and removed as the work demands it, so there is nothing to provision ahead of time.
Autoscaling is opt-in. Start in a safe observe mode to see exactly what it would do against real traffic, then turn it on when you are ready.
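The idea behind an observe mode can be sketched as a decision that is always computed and recorded with its reason, but applied only when autoscaling is switched on. The code below is a hypothetical illustration of that pattern; `ScaleAction`, `plan_action`, and the `observe_only` flag are names invented for this example and do not describe DeltaForge's internals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScaleAction:
    current: int   # workers running now
    target: int    # workers the policy would like
    reason: str    # human-readable explanation of the decision
    applied: bool  # False when only observing or when nothing changes


def plan_action(current: int, target: int, observe_only: bool) -> ScaleAction:
    """Record what scaling would do; apply it only outside observe mode."""
    if target > current:
        reason = f"scale up by {target - current}: demand exceeds capacity"
    elif target < current:
        reason = f"scale down by {current - target}: demand dropped"
    else:
        reason = "no change: capacity matches demand"
    applied = (not observe_only) and target != current
    return ScaleAction(current, target, reason, applied)


dry_run = plan_action(current=2, target=5, observe_only=True)
print(dry_run.reason, "| applied:", dry_run.applied)
```

Comparing the recorded actions against real traffic before enabling autoscaling shows what would have changed without touching running capacity.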
Quick answers on scaling lakehouse compute on demand
DeltaForge measures in-flight queries, not open sessions, so an idle BI connection never holds capacity. When queries arrive with nowhere to run, stateless workers start automatically up to the ceiling you set; when work stops, workers finish their in-flight queries and shut down cleanly, down to zero if you allow it.
Yes. Set the minimum to zero and DeltaForge releases all workers during quiet periods, so you pay nothing for compute while idle. The first query afterward waits briefly for a worker to come online, then runs.
Only by the short time it takes a worker to start. If you prefer no startup delay on the common path, keep a warm baseline of one or more workers always ready; you choose the cost-versus-latency tradeoff.
No. There are no clusters to size or warm up. Autoscaling is opt-in and can run in a safe observe mode first, so you can see what it would do before turning it on. Each worker runs the same Rust compute engine, coordinated by the control plane.
Hands-on guides for the workloads your workers will be running
Maintenance jobs are exactly the kind of bursty work autoscaling absorbs: capacity rises for the run, then drains back down.
Run full DML on Delta tables from plain SQL, on stateless workers instead of a long-lived cluster.
Let capacity follow demand, from zero to peak and back, on your own cloud.