Write SQL, commit to Git, and Delta Forge automatically discovers your pipelines, extracts lineage, computes execution order, and schedules runs. No DAGs. No YAML. No orchestration code.
Your pipeline code lives in Git. Delta Forge does the rest.
Schedules, pipelines, and queries: all expressed in pure SQL. Each pipeline is a .sql file with PIPELINE and SCHEDULE declarations.
-- Schedule: when and how to run
SCHEDULE daily_etl
CRON '0 6 * * *'
TIMEZONE 'America/New_York'
TARGET_NODES ALL
DESCRIPTION 'Daily morning ETL batch'
RETRIES 0
RETRY_DELAY 60
TIMEOUT 3600
MAX_CONCURRENT 1
PRIORITY 10
CATCHUP false
NOTIFY 'team@example.com'
WEBHOOK 'https://hooks.slack.com/...'
ACTIVE
;
-- Pipeline: what to run
PIPELINE my_etl_pipeline
DESCRIPTION 'Daily ETL pipeline for customer data'
SCHEDULE 'daily_etl'
TAGS 'etl', 'customers', 'daily'
SLA 4.0
FAIL_FAST true
DEFAULTS ($run_date = '2024-01-01')
STATUS ACTIVE
;
-- The actual SQL that runs
CREATE DELTA TABLE IF NOT EXISTS bronze.sales
LOCATION '/lake/bronze/sales'
AS SELECT *
FROM raw.csv.sales
WHERE sale_date = $run_date
ORDER BY id;
Cron expressions, timezone, retries, timeout, concurrency limits, email notifications, Slack webhooks, and priority — all in one declaration.
Description, schedule reference, SLA targets, tags, fail-fast mode, and parameterized defaults. The pipeline points to its schedule by name.
No special syntax after the declarations. Write the SQL you already know — CREATE TABLE, INSERT, MERGE, any standard SQL statement.
Variables like $run_date are declared with defaults in the PIPELINE block and can be overridden at execution time.
Delta Forge computes the DAG from your SQL — you never define it
Parses every SQL statement to find which tables each pipeline reads and which tables it writes. No annotations needed.
Builds a directed acyclic graph automatically from the read/write relationships across all pipelines in the workspace.
Uses topological sort to compute execution layers. Pipelines in the same layer have no mutual dependencies and run concurrently.
Detects circular dependencies between pipelines and surfaces warnings before execution, preventing infinite loops or deadlocks.
Example: Three pipelines, automatically ordered
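As an illustrative sketch (the pipeline and table names below are hypothetical, following the declaration syntax shown above), consider three pipeline files whose execution order Delta Forge infers purely from read/write analysis:

```sql
-- pipelines/ingest_sales.sql — writes bronze.sales
PIPELINE ingest_sales
DESCRIPTION 'Load raw sales into bronze'
STATUS ACTIVE
;
CREATE DELTA TABLE IF NOT EXISTS bronze.sales
AS SELECT * FROM raw.csv.sales;

-- pipelines/clean_sales.sql — reads bronze.sales, writes silver.sales
PIPELINE clean_sales
DESCRIPTION 'Deduplicate and type-cast sales'
STATUS ACTIVE
;
CREATE DELTA TABLE IF NOT EXISTS silver.sales
AS SELECT DISTINCT * FROM bronze.sales;

-- pipelines/revenue_rollup.sql — reads silver.sales, writes gold.daily_revenue
PIPELINE revenue_rollup
DESCRIPTION 'Aggregate daily revenue'
STATUS ACTIVE
;
CREATE DELTA TABLE IF NOT EXISTS gold.daily_revenue
AS SELECT sale_date, SUM(amount) AS revenue
FROM silver.sales
GROUP BY sale_date;
```

Because clean_sales reads a table that ingest_sales writes, and revenue_rollup reads a table that clean_sales writes, the computed order is ingest_sales, then clean_sales, then revenue_rollup. No dependency is declared anywhere; the DAG falls out of the SQL.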
A fundamentally different approach to data pipeline orchestration
No Python DAG code to maintain. The execution graph is computed automatically from the SQL itself — read/write analysis replaces manual dependency wiring.
No manifest files, no configuration drift. The pipeline IS the SQL file. One artifact, one source of truth, zero synchronization overhead.
Code review, versioning, branching, and rollback are built in via Git. No screenshots of drag-and-drop canvases in pull requests.
No vendor lock-in. Your pipelines are portable SQL files that live in your Git repository. Move between platforms without rewriting orchestration logic.
Enterprise-grade features, declared in SQL
Add --APPROVAL REQUIRED to enforce review gates before production execution. Approval clears on source change.
Declare SLA targets in hours. Delta Forge tracks execution time and alerts when pipelines exceed their SLA window.
Built-in NOTIFY and WEBHOOK fields on schedules. Get alerts on success, failure, or SLA breach without external tooling.
DEFAULTS block declares variables with fallback values. Override at runtime for backfills or ad-hoc executions.
Shared SQL modules that multiple pipelines can include. Write common logic once, reference it everywhere.
STATUS field supports DRAFT and ACTIVE. Develop and test pipelines in draft mode before enabling scheduled execution.
FAIL_FAST true stops the pipeline on the first statement error. Set to false to continue executing remaining statements.
MAX_CONCURRENT controls how many instances of a schedule can run at once. Prevent overlapping runs and resource contention.
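Putting several of these features together, a draft pipeline with an approval gate, an SLA target, and parameterized defaults might look like this (pipeline name, tables, and values are illustrative, not from the product docs):

```sql
--APPROVAL REQUIRED
PIPELINE orders_merge
DESCRIPTION 'Incremental merge of orders into silver'
SCHEDULE 'daily_etl'
SLA 2.0
FAIL_FAST false
DEFAULTS ($run_date = '2024-01-01', $region = 'us-east')
STATUS DRAFT
;
MERGE INTO silver.orders AS t
USING (
  SELECT * FROM bronze.orders
  WHERE order_date = $run_date AND region = $region
) AS s
ON t.order_id = s.order_id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *;
```

With STATUS DRAFT the pipeline can be developed and tested without being picked up by the scheduler; flipping it to ACTIVE (and obtaining approval) enables scheduled execution, and $run_date or $region can be overridden at runtime for backfills.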
A complete development lifecycle for SQL pipelines — built around the tools your team already knows
Write SQL pipelines in a full-featured editor with IntelliSense, catalog browsing, and SSMS-style inline execution.
Every pipeline is a SQL file stored in a Git repository. Each workspace has its own repo with full branching support.
pipelines/<name>.sql
Promote pipelines through approval gates and schedule them for automatic execution on compute workers.
Workspaces organize pipelines, permissions, and source control into a single governed unit
A workspace is the unit of collaboration. It groups pipelines with their Git repository, team permissions, and execution schedules.
All pipelines in a workspace share a single Git repo. Branch, merge, and review changes as a team.
Owners, editors, and viewers. RBAC governs who can edit pipelines, approve for production, or view results.
Each pipeline has its own cron schedule, compute node assignment, and approval gate for production.
Mark pipelines as requiring approval before scheduled execution. Approval clears automatically when source code changes.
All Git operations are available directly from the pipeline editor toolbar — no terminal needed
Save your pipeline changes to the catalog, then commit and push to the remote Git repository — all from the editor toolbar dropdown.
Create feature branches, switch between branches, and merge changes. The toolbar shows your current branch and sync status at a glance.
Pull the latest changes from remote before starting work. Branch status indicators show clean (green), modified (yellow), or conflict (red).
View diffs of your changes before committing. Full commit history is available for auditing and understanding how pipelines evolved over time.
Create pull requests directly from the editor for team code review. Ensure pipeline changes are reviewed before they reach production.
Every scheduled execution records the Git commit SHA. Trace any production result back to the exact version of SQL that produced it.
Understand how data flows through your pipelines — built in, not bolted on
See which tables feed which downstream tables. Lineage maps the full data flow across your pipeline from source tables to final outputs.
Lineage is derived from your pipeline SQL — zero configuration, zero manual annotation. Write your SQL and the lineage graph appears.
Not an add-on or third-party integration. Data lineage is built into the Delta Forge platform from day one, available on every plan.
Visualize upstream and downstream dependency graphs directly in the editor. Understand at a glance how data moves through your pipeline.
Tracks data flow across multi-statement SQL pipelines. Temporary tables, CTEs, and intermediate results are followed across statement boundaries.
Understand what breaks when a source table changes. Trace downstream dependencies to assess the blast radius of schema changes before they happen.
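As a concrete sketch (hypothetical tables), lineage follows intermediate results across statement boundaries, so a temporary staging table does not break the chain from source to output:

```sql
-- Statement 1: intermediate result
CREATE TEMP TABLE stg_returns AS
SELECT order_id, return_reason
FROM bronze.returns
WHERE return_date = $run_date;

-- Statement 2: final output built from the temp table
INSERT INTO gold.return_stats
SELECT return_reason, COUNT(*) AS n
FROM stg_returns
GROUP BY return_reason;
```

Here lineage resolves the chain bronze.returns to stg_returns to gold.return_stats, so gold.return_stats is correctly reported as downstream of bronze.returns even though no single statement mentions both tables.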
Git-native pipelines, automatic execution order, and zero orchestration code.