Delta Lake is an open table format, not a Spark feature. Any engine that writes Parquet and commits to the transaction log can run the full set of operations. This guide collects the ones DeltaForge runs as plain SQL, each with a tutorial you can reproduce.
A Delta Lake table is a directory of ordinary Parquet data files plus a _delta_log
directory of JSON commits that record which files belong to each version. The protocol is an open
specification. Spark is its reference implementation, but it is not a dependency: any engine that can
write Parquet and append correct commits to the log can read, write, and evolve the table.
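To make the log mechanics concrete, here is a minimal Python sketch of how a reader could work out which data files belong to a given version by replaying the JSON commits in order. It assumes the general Delta log layout (zero-padded version file names, one JSON action per line, `add` and `remove` actions) and is deliberately simplified: real commits also carry metadata, protocol, and statistics fields, and real logs include checkpoints. The helper names (`write_commit`, `live_files`, `build_demo_log`) and the file names are hypothetical, and none of this is DeltaForge's implementation.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional, Set


def write_commit(log_dir: Path, version: int, actions: list) -> None:
    """Write one commit as newline-delimited JSON, named by zero-padded version."""
    path = log_dir / f"{version:020d}.json"
    path.write_text("\n".join(json.dumps(a) for a in actions) + "\n")


def live_files(log_dir: Path, as_of: Optional[int] = None) -> Set[str]:
    """Replay commits in version order; return the data files in that snapshot."""
    files: Set[str] = set()
    # Zero-padded names sort lexicographically in version order.
    for commit in sorted(log_dir.glob("*.json")):
        version = int(commit.stem)
        if as_of is not None and version > as_of:
            break  # stop early: this is what reading an older version amounts to
        for line in commit.read_text().splitlines():
            if not line.strip():
                continue
            action = json.loads(line)
            if "add" in action:
                files.add(action["add"]["path"])
            elif "remove" in action:
                files.discard(action["remove"]["path"])
    return files


def build_demo_log(root: Path) -> Path:
    """Create a two-version log: an initial load, then a copy-on-write rewrite."""
    log_dir = root / "_delta_log"
    log_dir.mkdir(parents=True)
    # Version 0: two data files are added (simplified actions, path only).
    write_commit(log_dir, 0, [
        {"add": {"path": "part-0000.parquet"}},
        {"add": {"path": "part-0001.parquet"}},
    ])
    # Version 1: one file is rewritten, so the old one is removed and a new one added.
    write_commit(log_dir, 1, [
        {"remove": {"path": "part-0001.parquet"}},
        {"add": {"path": "part-0002.parquet"}},
    ])
    return log_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log = build_demo_log(Path(tmp))
        print("version 0:", sorted(live_files(log, as_of=0)))
        print("latest:   ", sorted(live_files(log)))
```

Replaying only up to an earlier version is the core idea behind time travel, and the remove-plus-add pair in version 1 is what a copy-on-write update looks like in the log.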
DeltaForge is a commercial engine you install on your own cloud VMs, on-premises servers, or air-gapped environments. It implements the Delta write protocol directly and exposes it as PostgreSQL-flavored SQL, so the operations below are statements you already know, with no JVM cluster, no notebook, and no managed service reading your tables. Because the output is standard Delta Lake, the tables stay readable by Spark, Databricks, DuckDB, Trino, and delta-rs afterward.
Each operation below links to a runnable tutorial with asserted results.
Copy-on-write internals and a three-way MERGE upsert with WHEN NOT MATCHED BY SOURCE.
Turn a raw CSV into a managed Delta table in two SQL statements, no ingestion framework.
Read row-level inserts, updates, and deletes between versions with table_changes().
Maintain slowly changing dimensions with effective dates using a single MERGE.
Query an earlier table version, and fix the common "cannot time travel to version" error.
Hard-delete a subject's rows and reclaim the underlying files so the data is truly gone.
Compact small files, reclaim space, and cluster data for faster reads, all in SQL.

Yes. Delta Lake is an open table format: a directory of Parquet data files plus a transaction log of JSON commits. Any engine that can write Parquet and append valid commits to the log can operate on the table. Spark is the reference implementation, not a requirement. DeltaForge, a commercial engine you install on your own infrastructure, implements the operations as plain SQL.
Inserts, updates, deletes, and full MERGE upserts; change data feed; time travel; SCD Type 2 history; GDPR deletes; and table maintenance with OPTIMIZE, VACUUM, and Z-ORDER. Each is a standard SQL statement, with no JVM cluster and no notebook.
Yes. DeltaForge writes the standard Delta transaction protocol, so the resulting tables stay readable by Spark, Databricks, DuckDB, Trino, and delta-rs. There is no proprietary on-disk format and no lock-in to the engine.
Yes. Because the format is the open Delta Lake protocol, DeltaForge reads and writes the same tables a Databricks or Spark job produces, and the reverse holds too. You can operate on one table with more than one engine.
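Several engines can share one table because each writer has to claim the next version number in the log, and only one writer can win a given number. The Python sketch below illustrates that idea on a local filesystem using exclusive file creation. It is an assumption-laden simplification, not DeltaForge's or Spark's actual commit code: the helper names (`latest_version`, `try_commit`, `commit_with_retry`) are hypothetical, object stores need their own conditional-write mechanism, and a real writer must also check whether the winning commit conflicts with its own changes before retrying.

```python
import json
import tempfile
from pathlib import Path


def latest_version(log_dir: Path) -> int:
    """Return the highest committed version, or -1 if the log is empty."""
    return max((int(p.stem) for p in log_dir.glob("*.json")), default=-1)


def try_commit(log_dir: Path, version: int, actions: list) -> bool:
    """Try to claim `version`; return False if another writer already holds it."""
    path = log_dir / f"{version:020d}.json"
    try:
        # Mode "x" is exclusive creation: open() raises if the file already exists.
        with open(path, "x") as f:
            for action in actions:
                f.write(json.dumps(action) + "\n")
    except FileExistsError:
        return False
    return True


def commit_with_retry(log_dir: Path, actions: list, max_attempts: int = 5) -> int:
    """Claim the next free version, retrying when a concurrent writer wins."""
    for _ in range(max_attempts):
        version = latest_version(log_dir) + 1
        # Simplification: a real engine would re-read the winning commits here
        # and abort if they conflict with the files this transaction touched.
        if try_commit(log_dir, version, actions):
            return version
    raise RuntimeError("could not commit after retries")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log = Path(tmp) / "_delta_log"
        log.mkdir()
        # Two hypothetical engines race for version 0; only the first succeeds.
        print(try_commit(log, 0, [{"add": {"path": "engine-a.parquet"}}]))
        print(try_commit(log, 0, [{"add": {"path": "engine-b.parquet"}}]))
        # The loser retries and lands on the next version instead.
        print(commit_with_retry(log, [{"add": {"path": "engine-b.parquet"}}]))
```

The design point is that the log, not any one engine, arbitrates write order, which is why a table written by one engine stays consistent when another engine commits to it.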
Install DeltaForge and reproduce every statement above against your own Delta tables.