Which languages and drivers does DeltaForge support?

DeltaForge ships standard ODBC, JDBC, ADBC, Python, R, and .NET drivers. Any tool that speaks those protocols connects to Delta Lake and Apache Iceberg tables directly.

Can AI assistants query my lakehouse through DeltaForge?

Yes. The bundled MCP server exposes the catalog, pipeline graph, and SQL surface to MCP-compatible assistants such as Claude, Cursor, and Copilot.

Is the laptop experience the same as production?

Yes. DeltaForge is a single binary on your laptop and the same engine in production. There is no separate runtime to install and no cluster to provision before you write your first query.

Is there a VS Code integration?

Yes. The DeltaForge VS Code extension adds a catalog browser, a pipeline panel, snippet templates, and a query result panel.

Build on your lakehouse, with tools you already use

DeltaForge is a single binary that runs native SQL on Delta Lake and Apache Iceberg. No JVM, no Spark, no cluster. Connect through the standard ODBC, JDBC, Python, R, or .NET drivers, or wire it into Claude and Cursor through the built-in MCP server.

VS Code extension, desktop app, SQL CLI, and MCP server

Standard ODBC, JDBC, Python, R, and .NET drivers

Single binary on your laptop, same engine in production

The dev stack

Three building blocks that ship in one binary

SQL engine

Native SQL on Delta Lake and Apache Iceberg. PostgreSQL-flavored grammar plus lakehouse commands: MERGE, time travel, deletion vectors, change data feed, UniForm interop, PIPELINE, VACUUM, OPTIMIZE.

Native property graph

Project your Delta tables into a property graph in place. Cypher plus 32 algorithms (PageRank, Leiden, Bellman-Ford, FastRP embeddings, K-core, Yen's K-SP...) with 18 of them GPU-accelerated. Join graph results back to SQL in the same session.

MCP server for AI tools

Claude, Cursor, and Copilot get live access to your catalog, schemas, and SQL surface through the built-in Model Context Protocol server. No bespoke retrieval layer to maintain.
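For orientation, MCP clients talk to servers using JSON-RPC 2.0 messages, and tools are invoked with the `tools/call` method. The Python sketch below only builds such a request. The tool name `run_sql` and its `query` argument are hypothetical, since this page does not list the tools the DeltaForge MCP server exposes.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request for the MCP `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# `run_sql` and `query` are hypothetical names used for illustration only.
request = build_tool_call(1, "run_sql", {"query": "SELECT count(*) FROM sales.orders"})
print(json.dumps(request, indent=2))
```

In practice the assistant builds and sends these messages itself; the sketch is only meant to show that no custom retrieval code sits between the assistant and the SQL surface.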

From zero to a real query in two commands

No registration to download. No JVM to install. No cluster to provision. The official Delta Lake quickstart begins with installing Spark; this path skips that entirely. The Community license activates from a free account when you first launch.

Install (Linux, macOS)

# Headless engine + CLI for scripts and CI
curl -fsSL https://deltaforge.org/install.sh | sh -s -- --pkg deltaforge-cli

First query

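-- Create a Delta table from raw Parquet files
CREATE TABLE sales USING DELTA AS
SELECT id, amount, ts
FROM   read_parquet('s3://bucket/raw/*.parquet');

-- Upsert: update rows that match on id, insert the rest
MERGE INTO sales t
USING      updates u ON t.id = u.id
WHEN MATCHED THEN UPDATE SET amount = u.amount
WHEN NOT MATCHED THEN INSERT *;

-- Time travel: read the table as of version 3
SELECT SUM(amount) FROM sales VERSION AS OF 3;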


The PIPELINE command

A first-class SQL command parsed by the engine. Schedule, reliability settings, and metadata live in the same file as the SQL logic.

PIPELINE sales_daily_refresh
  SCHEDULE  '0 6 * * *'
  TIMEZONE  'America/New_York'
  OWNER     'data-team'
  TIMEOUT   '30m'
  RETRIES   3;

INSERT INTO gold.revenue
SELECT  product_id, SUM(amount) AS revenue
FROM    curated.sales
WHERE   sale_date >= CURRENT_DATE - INTERVAL '1 day'
GROUP BY product_id;
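Because the directives and the SQL share one file, the metadata is easy to read with ordinary tooling. The Python sketch below is an illustration only, not the engine's parser: it assumes the header ends at the first semicolon and that each directive is a keyword followed by one value, as in the example above.

```python
import re

PIPELINE_SQL = """\
PIPELINE sales_daily_refresh
  SCHEDULE  '0 6 * * *'
  TIMEZONE  'America/New_York'
  OWNER     'data-team'
  TIMEOUT   '30m'
  RETRIES   3;

INSERT INTO gold.revenue
SELECT product_id, SUM(amount) AS revenue
FROM curated.sales
GROUP BY product_id;
"""


def parse_pipeline_header(text):
    """Split a pipeline file into its name, header directives, and SQL body."""
    # Assumption: the header ends at the first semicolon in the file.
    header, _, body = text.partition(";")
    lines = header.strip().splitlines()
    name = re.fullmatch(r"PIPELINE\s+(\w+)", lines[0].strip()).group(1)
    directives = {}
    for line in lines[1:]:
        key, value = line.split(None, 1)  # keyword, then the rest of the line
        directives[key] = value.strip().strip("'")
    return name, directives, body.strip()


name, directives, body = parse_pipeline_header(PIPELINE_SQL)
print(name, directives)
```

A quoted value that itself contains a semicolon would break this simple split, which is one reason the real work belongs to the engine's own parser.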

Lineage from the code itself

DeltaForge reads the SQL and derives which tables each pipeline reads from and writes to. You do not declare lineage. Execution order across pipelines is calculated from those dependencies automatically.
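The idea can be sketched in a few lines of Python. This is a rough illustration, not how DeltaForge does it: the regular expressions are a crude stand-in for a real SQL parser, and the three pipelines are hypothetical. Each pipeline's reads and writes are extracted from its SQL, and a topological sort over those dependencies gives an execution order.

```python
import re
from graphlib import TopologicalSorter

# Hypothetical pipelines: name -> SQL body.
PIPELINES = {
    "gold_revenue": "INSERT INTO gold.revenue SELECT product_id, SUM(amount) "
                    "FROM curated.sales GROUP BY product_id",
    "curated_sales": "INSERT INTO curated.sales SELECT * FROM raw.sales WHERE amount > 0",
    "exec_report": "INSERT INTO gold.report SELECT * FROM gold.revenue "
                   "JOIN curated.sales USING (product_id)",
}

# Crude patterns; a real implementation would walk the parsed SQL tree.
WRITE_RE = re.compile(r"\b(?:INSERT|MERGE)\s+INTO\s+([\w.]+)", re.IGNORECASE)
READ_RE = re.compile(r"\b(?:FROM|JOIN)\s+([\w.]+)", re.IGNORECASE)


def lineage(sql):
    """Return (tables read, tables written) for one SQL statement."""
    writes = set(WRITE_RE.findall(sql))
    reads = set(READ_RE.findall(sql)) - writes
    return reads, writes


def execution_order(pipelines):
    """Order pipelines so each runs after the pipelines that produce its inputs."""
    info = {name: lineage(sql) for name, sql in pipelines.items()}
    producer = {table: name for name, (_, writes) in info.items() for table in writes}
    # Map each pipeline to the set of pipelines it depends on.
    graph = {
        name: {producer[table] for table in reads if table in producer}
        for name, (reads, _) in info.items()
    }
    return list(TopologicalSorter(graph).static_order())


order = execution_order(PIPELINES)
print(order)
```

Tables with no producing pipeline, such as `raw.sales` here, are treated as external inputs and impose no ordering constraint.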

Readable git diffs

Pull requests show SQL diffs. Reviewers see exactly what changed: a schedule, a filter condition, a column added to a SELECT. No opaque JSON config blobs to decode.

VS Code extension

Write, execute, and monitor pipelines without leaving your editor.

Pipeline authoring

Autocomplete for PIPELINE directives and schema-aware IntelliSense for table and column references drawn from the live catalog. No context-switching to look up column names or cron syntax.

Live query execution

Execute queries against a running compute node. Results appear in a paginated grid with full type fidelity. The extension auto-discovers healthy nodes and fails over if one is unreachable.
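The failover behavior follows a familiar pattern, sketched here in Python under stated assumptions. Nothing in it is DeltaForge API: `run_with_failover`, the node names, and `fake_execute` are hypothetical, and the extension's actual discovery and health checks are not described on this page.

```python
def run_with_failover(nodes, execute, query):
    """Try each node in order and return (node, result) from the first that answers."""
    errors = []
    for node in nodes:
        try:
            return node, execute(node, query)
        except ConnectionError as exc:
            # Remember the failure and move on to the next candidate node.
            errors.append((node, str(exc)))
    raise ConnectionError(f"all nodes failed: {errors}")


# Hypothetical stand-in for a real driver call: node-a is down, node-b answers.
def fake_execute(node, query):
    if node == "node-a":
        raise ConnectionError(f"{node} unreachable")
    return [("rows from", node)]


node, rows = run_with_failover(["node-a", "node-b"], fake_execute, "SELECT 1")
print(node, rows)
```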

Sidebar panels

Connection health, data catalog hierarchy, pipeline status grouped by state, SQL and pipeline snippet templates, and query history with execution time and row counts.

Desktop app

Configuration and management hub. Set up workspaces, connect git repositories, and explore the catalog.

Workspace setup

Connect a remote git repository as a DeltaForge workspace. Credentials stored in the OS keychain. One-click "Open in VS Code" writes the connection config and launches your editor.

Catalog explorer

Browse schemas, tables, and columns in a hierarchical tree. View column types, nullability, and partition info. Table version history with time travel. Quick actions: preview data, copy name, show DDL.

Compute and connections

Register and monitor compute nodes. Configure data source connections with credentials stored in the OS keychain or Azure Key Vault and distributed to compute nodes at execution time.

SQL CLI

Ad-hoc queries, script execution, and CI/CD automation from the terminal.

# Interactive REPL
delta-forge-cli --profile production

# Execute a script with variable substitution
delta-forge-cli run migrate.sql -D env=prod -D cutoff=2024-01-15

# One-shot query, JSON output
delta-forge-cli --format json query "SELECT count(*) FROM sales.orders"

# CI/CD: pipe from stdin
cat setup.sql | delta-forge-cli --force
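To make the `-D` variable substitution above concrete, here is a minimal Python sketch of the general technique. The `${name}` placeholder syntax is an assumption borrowed from Python's `string.Template`; this page does not show how variables are referenced inside a DeltaForge script, and the table name in the example is hypothetical.

```python
from string import Template


def parse_defines(pairs):
    """Turn ['env=prod', 'cutoff=2024-01-15'] into {'env': 'prod', ...}."""
    return dict(pair.split("=", 1) for pair in pairs)


def render(script, variables):
    """Fill placeholders; substitute() raises KeyError for an undefined variable."""
    return Template(script).substitute(variables)


# Assumed placeholder syntax: ${name}. The real CLI syntax may differ.
script = "DELETE FROM ${env}_sales.orders WHERE order_date < '${cutoff}';"
variables = parse_defines(["env=prod", "cutoff=2024-01-15"])
rendered = render(script, variables)
print(rendered)
```

Failing loudly on an undefined variable is a deliberate choice for CI use, where a silently empty value could run the wrong statement.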

What you can build

Four shapes of application that fit the stack

Data pipelines

Declarative PIPELINE definitions, idempotent MERGE upserts, change data feed for incremental ETL, and time-travel reads for backfills. Run them locally, schedule them in prod, audit them with SQL.
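Idempotent here means that replaying the same batch leaves the table unchanged, which is what makes retries and backfills safe. The Python sketch below models the MERGE semantics on plain lists of dicts to show that property. It is an illustration of the concept, not DeltaForge code, and the sample rows are made up.

```python
def merge_upsert(target, updates, key="id"):
    """Return a new table: rows matched on `key` are updated, the rest inserted."""
    merged = {row[key]: dict(row) for row in target}
    for row in updates:
        # Matched rows take the new column values; unmatched rows are inserted.
        merged[row[key]] = {**merged.get(row[key], {}), **row}
    return sorted(merged.values(), key=lambda r: r[key])


sales = [{"id": 1, "amount": 10}, {"id": 2, "amount": 20}]
updates = [{"id": 2, "amount": 25}, {"id": 3, "amount": 30}]

once = merge_upsert(sales, updates)
twice = merge_upsert(once, updates)  # replaying the same batch
print(once == twice)
```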

BI and embedded analytics

Point Power BI, Tableau, Excel, or DBeaver at the lake directly through ODBC. Embed query results into product surfaces via the same driver. No copy step into a second warehouse.

AI agents on your data

Expose tables, views, and the SQL surface to Claude or Cursor through the MCP server. Build retrieval, summarization, and write-back agents against the same engine your dashboards use.

Graph-driven features

Fraud rings, recommendations, supply-chain traversal, and lineage analysis on the tables you already have. Cypher and PageRank on Delta, joined back to SQL in the same session.
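For readers new to the algorithm, PageRank scores a node by how much rank flows into it from its neighbors. The Python sketch below is a textbook power-iteration version on a tiny, made-up edge list. It is not the DeltaForge implementation, and the damping factor and iteration count are conventional defaults chosen for illustration.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a list of (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        # Every node keeps a small base share, then receives rank from in-links.
        new_rank = {node: (1.0 - damping) / len(nodes) for node in nodes}
        for src in nodes:
            # A node with no out-links spreads its rank evenly over all nodes.
            targets = out_links[src] or nodes
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        rank = new_rank
    return rank


# Made-up transfers between accounts: a -> c, b -> c, c -> a.
ranks = pagerank([("a", "c"), ("b", "c"), ("c", "a")])
top = max(ranks, key=ranks.get)
print(top, ranks)
```

In the edge list above, account `c` receives transfers from both other accounts, so it ends up with the highest score.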

Resources for building

Reference, examples, and the code itself

Documentation

SQL reference, configuration, deployment, MCP setup, and connector guides at docs.deltaforge.org.

Engine on GitHub

Apache 2.0 source. Track releases, file issues, read PRs at deltaforge-org/delta-forge.

Demos and use cases

258 end-to-end SQL demos with 10,500+ machine-checked assertions. Run them locally to learn the grammar by example.

Benchmark and conformance repos

Public harness for TPC-H, TPC-DS, SSB, and JOB, plus 7,137 bi-directional Spark scenarios. Reproduce, audit, or extend on your own hardware.

Further reading

MERGE, UPDATE, and DELETE on Delta Lake without Spark

CSV to Delta Lake without Spark

Run Cypher on Parquet and Delta tables without Neo4j

Run it on your laptop in minutes
