Developers

Build on your lakehouse,
with tools you already use

DeltaForge is a single binary that runs native SQL on Delta Lake and Apache Iceberg. No JVM, no Spark, no cluster. Connect through standard ODBC, JDBC, Python, R, or .NET drivers, or wire it into Claude and Cursor through the built-in MCP server.

VS Code extension, desktop app, SQL CLI, and MCP server
Standard ODBC, JDBC, Python, R, and .NET drivers
Single binary on your laptop, same engine in production
[Diagram: the DeltaForge control plane (compute node, catalog, scheduler) surrounded by four developer surfaces. VS Code extension: PIPELINE sales_daily on SCHEDULE '0 6 * * *'. Desktop app: workspace and pipeline status. SQL CLI: delta-forge-cli, delta-forge-cli run job.sql, delta-forge-cli -f json. MCP server: delta-forge-mcp install claude, with tools list_tables, validate_syntax, extract_lineage, explain_query.]

The dev stack

Three building blocks that ship in one binary

SQL engine

Native SQL on Delta Lake and Apache Iceberg. PostgreSQL-flavored grammar plus lakehouse commands: MERGE, time travel, deletion vectors, change data feed, UniForm interop, PIPELINE, VACUUM, OPTIMIZE.

Native property graph

Project your Delta tables into a property graph in place. Cypher plus 32 algorithms (PageRank, Leiden, Bellman-Ford, FastRP embeddings, K-core, Yen's k-shortest paths, and more), 18 of them GPU-accelerated. Join graph results back to SQL in the same session.

MCP server for AI tools

Claude, Cursor, and Copilot get live access to your catalog, schemas, and SQL surface through the built-in Model Context Protocol server. No bespoke retrieval layer to maintain.

Quickstart

From zero to a real query
in two commands

No registration to download. No JVM to install. No cluster to provision. The Community license activates from a free account when you first launch.

Install (Linux, macOS)

# Headless engine + CLI for scripts and CI
curl -fsSL https://deltaforge.org/install.sh | sh -s -- --pkg deltaforge-cli

First query

CREATE TABLE sales USING DELTA AS
SELECT id, amount, ts
FROM   read_parquet('s3://bucket/raw/*.parquet');

MERGE INTO sales t
USING      updates u ON t.id = u.id
WHEN MATCHED THEN UPDATE SET amount = u.amount
WHEN NOT MATCHED THEN INSERT *;

SELECT SUM(amount) FROM sales VERSION AS OF 3;

The PIPELINE command

PIPELINE is a first-class SQL command parsed by the engine. Schedule, reliability settings, and metadata live in the same file as the SQL logic.

PIPELINE sales_daily_refresh
  SCHEDULE  '0 6 * * *'
  TIMEZONE  'America/New_York'
  OWNER     'data-team'
  TIMEOUT   '30m'
  RETRIES   3;

INSERT INTO gold.revenue
SELECT  product_id, SUM(amount) AS revenue
FROM    curated.sales
WHERE   sale_date >= CURRENT_DATE - INTERVAL '1 day'
GROUP BY product_id;
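
To make the one-file layout concrete, here is a minimal Python sketch that splits the pipeline file above into its header directives and its SQL body. It is not DeltaForge's parser: the helper name split_pipeline and the regular expression are assumptions for illustration, and the sketch only understands the directive forms shown in the example (a keyword followed by a quoted string or an integer).

```python
import re

PIPELINE_FILE = """\
PIPELINE sales_daily_refresh
  SCHEDULE  '0 6 * * *'
  TIMEZONE  'America/New_York'
  OWNER     'data-team'
  TIMEOUT   '30m'
  RETRIES   3;

INSERT INTO gold.revenue
SELECT  product_id, SUM(amount) AS revenue
FROM    curated.sales
WHERE   sale_date >= CURRENT_DATE - INTERVAL '1 day'
GROUP BY product_id;
"""

# Assumed directive shape: KEYWORD followed by a quoted string or an integer.
DIRECTIVE = re.compile(r"^\s*([A-Z_]+)\s+(?:'([^']*)'|(\d+))\s*$")


def split_pipeline(text):
    """Hypothetical helper: return (name, directives, sql_body)."""
    # The header ends at the first semicolon; everything after it is SQL.
    header, _, body = text.partition(";")
    lines = header.strip().splitlines()
    name = lines[0].split()[1]  # "PIPELINE <name>"
    directives = {}
    for line in lines[1:]:
        match = DIRECTIVE.match(line)
        if match is None:
            raise ValueError(f"unrecognized directive: {line!r}")
        key, quoted, number = match.groups()
        directives[key] = quoted if quoted is not None else int(number)
    return name, directives, body.strip()


name, directives, body = split_pipeline(PIPELINE_FILE)
print(name)
print(directives)
```

Because the schedule and retry settings are plain text next to the query, a change to either one shows up as an ordinary line-level diff.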

Lineage from the code itself

DeltaForge reads the SQL and derives which tables each pipeline reads from and writes to. You do not declare lineage. Execution order across pipelines is calculated from those dependencies automatically.
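
The idea can be sketched in a few lines of Python: extract the tables each statement reads and writes, then sort pipelines so that writers run before their readers. This is an illustration of the technique, not DeltaForge's implementation. The regex-based extraction is a deliberate simplification (a real engine would use its SQL parser, and this one would be fooled by constructs such as EXTRACT(YEAR FROM ts)), and the pipelines curate_sales and revenue_report and their tables are hypothetical.

```python
import re
from graphlib import TopologicalSorter

# Only sales_daily_refresh mirrors the example on this page; the rest are made up.
PIPELINES = {
    "revenue_report": (
        "INSERT INTO gold.report SELECT r.product_id, r.revenue "
        "FROM gold.revenue r JOIN curated.products p ON p.id = r.product_id"
    ),
    "sales_daily_refresh": (
        "INSERT INTO gold.revenue SELECT product_id, SUM(amount) AS revenue "
        "FROM curated.sales GROUP BY product_id"
    ),
    "curate_sales": "INSERT INTO curated.sales SELECT id, amount, ts FROM raw.sales",
}

WRITES = re.compile(r"\b(?:INSERT|MERGE)\s+INTO\s+([\w.]+)", re.IGNORECASE)
READS = re.compile(r"\b(?:FROM|JOIN|USING)\s+([\w.]+)", re.IGNORECASE)


def lineage(sql):
    """Return (tables read, tables written) for one SQL statement."""
    return set(READS.findall(sql)), set(WRITES.findall(sql))


def execution_order(pipelines):
    """Order pipelines so each runs after the pipelines that write its inputs."""
    info = {name: lineage(sql) for name, sql in pipelines.items()}
    writer_of = {t: name for name, (_, writes) in info.items() for t in writes}
    # Map each pipeline to the set of pipelines it depends on.
    graph = {
        name: {writer_of[t] for t in reads if writer_of.get(t, name) != name}
        for name, (reads, _) in info.items()
    }
    return list(TopologicalSorter(graph).static_order())


print(execution_order(PIPELINES))
```

Tables with no writer among the pipelines (raw.sales, curated.products) are treated as external inputs and add no dependency.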

Readable git diffs

Pull requests show SQL diffs. Reviewers see exactly what changed: a schedule, a filter condition, a column added to a SELECT. No opaque JSON config blobs to decode.

VS Code extension

Write, execute, and monitor pipelines without leaving your editor.

Pipeline authoring

Autocomplete for PIPELINE directives and schema-aware IntelliSense for table and column references drawn from the live catalog. No context-switching to look up column names or cron syntax.

Live query execution

Execute queries against a running compute node. Results appear in a paginated grid with full type fidelity. The extension auto-discovers healthy nodes and fails over if one is unreachable.
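
Failover of this kind generally amounts to trying candidate nodes in turn and using the first one that answers. The Python sketch below shows that pattern under stated assumptions: run_with_failover, fake_execute, and the node names are hypothetical, and nothing here reflects how the extension actually discovers or probes nodes.

```python
def run_with_failover(nodes, execute):
    """Try each node in order; return (node, result) from the first that responds."""
    unreachable = []
    for node in nodes:
        try:
            return node, execute(node)
        except ConnectionError:
            unreachable.append(node)  # remember the failure and try the next node
    raise ConnectionError(f"no reachable compute node among {unreachable}")


def fake_execute(node):
    """Stand-in for sending a query to a node; node-a is simulated as down."""
    if node == "node-a":
        raise ConnectionError("node-a is unreachable")
    return f"rows from {node}"


node, result = run_with_failover(["node-a", "node-b"], fake_execute)
print(node, result)
```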

Sidebar panels

Connection health, data catalog hierarchy, pipeline status grouped by state, SQL and pipeline snippet templates, and query history with execution time and row counts.

Desktop app

Configuration and management hub. Set up workspaces, connect git repositories, and explore the catalog.

Workspace setup

Connect a remote git repository as a DeltaForge workspace. Credentials stored in the OS keychain. One-click "Open in VS Code" writes the connection config and launches your editor.

Catalog explorer

Browse schemas, tables, and columns in a hierarchical tree. View column types, nullability, and partition info. Table version history with time travel. Quick actions: preview data, copy name, show DDL.

Compute and connections

Register and monitor compute nodes. Configure data source connections with credentials stored in the OS keychain or Azure Key Vault and distributed to compute nodes at execution time.

SQL CLI

Ad-hoc queries, script execution, and CI/CD automation from the terminal.

# Interactive REPL
delta-forge-cli --profile production

# Execute a script with variable substitution
delta-forge-cli run migrate.sql -D env=prod -D cutoff=2024-01-15

# One-shot query, JSON output
delta-forge-cli --format json query "SELECT count(*) FROM sales.orders"

# CI/CD: pipe from stdin
cat setup.sql | delta-forge-cli --force

What you can build

Four shapes of application that fit the stack

Data pipelines

Declarative PIPELINE definitions, idempotent MERGE upserts, change data feed for incremental ETL, and time-travel reads for backfills. Run them locally, schedule them in prod, audit them with SQL.
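
Why a MERGE upsert is safe to re-run can be shown with a small Python model. The sketch treats a table as a dict keyed by id and mirrors the MERGE from the quickstart (update amount when matched, insert the whole row otherwise). It models the semantics only, not how the engine executes a MERGE, and the sample rows are made up.

```python
def merge_upsert(target, updates):
    """Return a new table after a MERGE-style upsert keyed on id."""
    merged = {row_id: dict(row) for row_id, row in target.items()}
    for row in updates:
        if row["id"] in merged:
            # WHEN MATCHED THEN UPDATE SET amount = u.amount
            merged[row["id"]]["amount"] = row["amount"]
        else:
            # WHEN NOT MATCHED THEN INSERT *
            merged[row["id"]] = dict(row)
    return merged


sales = {
    1: {"id": 1, "amount": 10.0, "ts": "2024-01-01"},
    2: {"id": 2, "amount": 20.0, "ts": "2024-01-02"},
}
updates = [
    {"id": 2, "amount": 25.0, "ts": "2024-01-03"},
    {"id": 3, "amount": 30.0, "ts": "2024-01-03"},
]

once = merge_upsert(sales, updates)
twice = merge_upsert(once, updates)  # simulate a retried pipeline run
print(once == twice)
```

Applying the same batch a second time changes nothing, which is what makes retries and backfills safe; a plain INSERT of the same batch would have duplicated row 3.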

BI and embedded analytics

Point Power BI, Tableau, or Excel at the lake directly through ODBC. Embed query results into product surfaces via the same driver. No copy step into a second warehouse.

AI agents on your data

Expose tables, views, and the SQL surface to Claude or Cursor through the MCP server. Build retrieval, summarization, and write-back agents against the same engine your dashboards use.

Graph-driven features

Fraud rings, recommendations, supply-chain traversal, and lineage analysis on the tables you already have. Cypher and PageRank on Delta, joined back to SQL in the same session.

Resources for building

Reference, examples, and the code itself

Documentation

SQL reference, configuration, deployment, MCP setup, and connector guides at docs.deltaforge.org.

Engine on GitHub

Apache 2.0 licensed source. Track releases, file issues, and read PRs at deltaforge-org/delta-forge.

Demos and use cases

258 end-to-end SQL demos with 10,500+ machine-checked assertions. Run them locally to learn the grammar by example.

Benchmark and conformance repos

Public harness for TPC-H, TPC-DS, SSB, and JOB, plus 7,137 bi-directional Spark scenarios. Reproduce, audit, or extend on your own hardware.

Run it on your laptop in minutes

Free Community license. Single binary. No JVM, no cluster, no credit card required.