Table Format

Native Delta Lake reader and writer

DeltaForge reads and writes Delta Lake tables directly: ACID transactions, deletion vectors, change data feed, time travel via Delta versions, and schema evolution, all without a Spark cluster.

ACID transactions with optimistic concurrency
Deletion vectors for surgical row deletes
UniForm: Delta tables readable as Iceberg
[Diagram: the transaction log (_delta_log/: 000.json, 001.json, 002.json, checkpoint.parquet) governs the Parquet data files (part-0001.parquet, 128 MB; part-0002.parquet, 256 MB; part-0003.parquet, 192 MB; part-0004.parquet, 210 MB). Deletion vectors are compact bitmaps, for example {1,3,6}, that avoid file rewrites. Column statistics (min/max, null count, histograms) are used by the optimizer to skip files. Labels: ACID, time travel, schema evolution, CDF.]

Core Delta capabilities

The features most production workloads depend on, implemented natively

ACID transactions

Every write commits atomically via the transaction log. Concurrent writers use optimistic concurrency: conflicts are detected at commit time rather than prevented by upfront locking. Readers never see partial writes.
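To make commit-time conflict detection concrete, here is a minimal in-memory sketch. It assumes the log behaves like a store with an atomic put-if-absent on the next version number. The names InMemoryLog, try_commit, and commit_with_retry are hypothetical and are not DeltaForge APIs; a real implementation would also check which files each transaction read and wrote before deciding whether a retry is safe.

```python
import threading


class InMemoryLog:
    """Toy stand-in for _delta_log/: maps version number -> committed actions."""

    def __init__(self):
        self._entries = {}
        self._lock = threading.Lock()  # models the store's atomic put-if-absent

    def latest_version(self):
        with self._lock:
            return max(self._entries, default=-1)

    def try_commit(self, version, actions):
        """Write `version` only if it does not exist yet; return True on success."""
        with self._lock:
            if version in self._entries:
                return False  # another writer claimed this version first
            self._entries[version] = list(actions)
            return True


def commit_with_retry(log, actions, max_attempts=5):
    """Optimistic loop: read the latest version, then try to claim the next one."""
    for _ in range(max_attempts):
        target = log.latest_version() + 1
        if log.try_commit(target, actions):
            return target
        # Conflict detected at commit time: re-read the log and try again.
    raise RuntimeError("gave up after repeated commit conflicts")


log = InMemoryLog()
first = commit_with_retry(log, ["add part-0001.parquet"])
second = commit_with_retry(log, ["add part-0002.parquet"])
stale = log.try_commit(first, ["add part-9999.parquet"])  # stale writer loses
```

The stale writer's attempt fails without disturbing the committed entries, which is why readers only ever observe whole versions.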

Deletion vectors

DELETE and MERGE record deleted row positions in compact bitmaps rather than rewriting Parquet files. Readers apply the bitmap during the scan. VACUUM permanently removes unreferenced files once the retention period allows.
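The read path described above can be sketched in a few lines. This is an illustration only: a Python set stands in for the compressed bitmap, and scan_with_deletion_vector is a hypothetical helper, not a DeltaForge function.

```python
def scan_with_deletion_vector(rows, deleted_positions):
    """Yield rows whose position in the file is not marked deleted."""
    for position, row in enumerate(rows):
        if position not in deleted_positions:
            yield row


# Eight rows as they sit in one immutable Parquet file (positions 0..7).
file_rows = [f"row-{i}" for i in range(8)]

# Hypothetical deletion vector for that file; a set stands in for the bitmap.
deletion_vector = {1, 3, 6}

visible = list(scan_with_deletion_vector(file_rows, deletion_vector))
```

The data file itself is never modified; only the small bitmap changes when more rows are deleted.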

Time travel via Delta versions

Query any committed version by number or timestamp. Useful for audit reconstruction, incremental pipelines, and reverting accidental changes.

SELECT * FROM events VERSION AS OF 42
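Conceptually, reading a past version means replaying the log's add and remove actions up to that version to find which files were live. The sketch below assumes a simplified log of (action, path) pairs; files_at_version is a hypothetical helper, and a real reader would typically start from the latest checkpoint rather than from version 0.

```python
def files_at_version(log, version):
    """Replay add/remove actions from version 0 through `version`."""
    live = set()
    for v in range(version + 1):
        for action, path in log[v]:
            if action == "add":
                live.add(path)
            elif action == "remove":
                live.discard(path)
    return live


# Hypothetical three-commit log: version 2 replaces part-0001 with part-0003.
log = {
    0: [("add", "part-0001.parquet")],
    1: [("add", "part-0002.parquet")],
    2: [("remove", "part-0001.parquet"), ("add", "part-0003.parquet")],
}
snapshot_v1 = files_at_version(log, 1)
snapshot_v2 = files_at_version(log, 2)
```

Because removed files stay on disk until VACUUM deletes them, version 1 remains readable after version 2 commits.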

Change Data Feed

Track every row-level change (insert, update pre-image and post-image, delete) across a version or timestamp range. Useful for incremental ETL, cache invalidation, and audit trails.

SELECT * FROM table_changes('customers', 100, 150)
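One way a downstream consumer might apply such a feed to a cache or replica is sketched below. It assumes each change row carries a _change_type column with the values insert, update_preimage, update_postimage, and delete, which is the Delta Lake convention; whether DeltaForge exposes exactly these names is an assumption here, and the id and name columns are made up for the example.

```python
def apply_changes(target, changes):
    """Apply change rows (in commit order) to a dict keyed by primary key."""
    for change in changes:
        kind, key = change["_change_type"], change["id"]
        if kind in ("insert", "update_postimage"):
            target[key] = change["name"]
        elif kind == "delete":
            target.pop(key, None)
        # update_preimage rows carry the old value; nothing to apply.
    return target


replica = {1: "Ada", 2: "Bob"}
changes = [
    {"_change_type": "update_preimage", "id": 2, "name": "Bob"},
    {"_change_type": "update_postimage", "id": 2, "name": "Robert"},
    {"_change_type": "delete", "id": 1, "name": "Ada"},
    {"_change_type": "insert", "id": 3, "name": "Cy"},
]
apply_changes(replica, changes)
```

The consumer only touches the keys that changed, which is what makes the feed suitable for incremental ETL.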

Schema evolution

Change table structure without rewriting data files or breaking downstream readers

Add columns

Add new columns at any position. Existing files return NULL for the new columns automatically.

ALTER TABLE t ADD COLUMN region STRING

Rename columns

Rename columns using Delta column mapping, which tracks each column by a stable ID rather than by name. Zero data movement: existing files are not touched.

ALTER TABLE t RENAME COLUMN old_name TO new_name

Type widening

Widen column types (int to bigint, float to double, decimal precision increase) without rewriting data.

ALTER TABLE t ALTER COLUMN amount TYPE DECIMAL(20,4)

Nested evolution

Add fields to struct types and evolve map and array schemas at any nesting depth.

ALTER TABLE t ADD COLUMN addr STRUCT<zip: STRING, city: STRING>

Table maintenance

Keep tables fast and storage efficient with built-in maintenance operations

OPTIMIZE

Compact small files into larger ones to reduce per-query file-open overhead. Supports predicate-scoped compaction to limit write amplification.

OPTIMIZE events WHERE date > '2024-01-01'

Z-ORDER

Co-locate related data across multiple columns using space-filling curves. Improves data skipping for multi-dimensional filter predicates.

OPTIMIZE events ZORDER BY (user_id, event_type)
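The idea behind a Z-order (Morton) curve is to interleave the bits of several column values into one sort key, so rows that are close in every dimension end up close in the file layout. The sketch below shows this for two small non-negative integers. It is an illustration of the general technique, not DeltaForge's implementation: the page does not say which curve is used, and real systems usually map column values to integer ranks first.

```python
def z_value(x, y, bits=16):
    """Interleave the low `bits` bits of x and y into one Morton code."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)      # x bits go to even positions
        code |= ((y >> i) & 1) << (2 * i + 1)  # y bits go to odd positions
    return code


# Sort a 4x4 grid of (x, y) pairs by their Z-order key.
points = [(x, y) for x in range(4) for y in range(4)]
z_sorted = sorted(points, key=lambda p: z_value(*p))
first_four = z_sorted[:4]
```

The first four points in Z-order form a 2x2 block, so a file holding them has tight min/max ranges on both columns and can be skipped by a filter on either one.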

VACUUM

Remove Parquet files that are no longer referenced by any version within the retention window. A DRY RUN mode lists the files that would be deleted before you commit to the cleanup.

VACUUM events RETAIN 168 HOURS
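The retention check can be pictured as a simple cutoff comparison: 168 hours is seven days, and only files that were logically removed before the cutoff are eligible. The sketch below is a hypothetical model of what a dry run might report; vacuum_candidates and the removed_at mapping are illustrative names, not DeltaForge APIs.

```python
from datetime import datetime, timedelta


def vacuum_candidates(removed_at, now, retain_hours=168):
    """Return files removed from the log longer ago than the retention window."""
    cutoff = now - timedelta(hours=retain_hours)
    return sorted(path for path, ts in removed_at.items() if ts < cutoff)


now = datetime(2024, 6, 15, 12, 0)
removed_at = {  # hypothetical: when each file was logically removed
    "part-0001.parquet": datetime(2024, 6, 1, 0, 0),
    "part-0002.parquet": datetime(2024, 6, 14, 0, 0),
}
dry_run = vacuum_candidates(removed_at, now)
```

Files removed inside the window are kept so that time travel to recent versions keeps working.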

ANALYZE

Compute column-level statistics (min, max, null count, histograms) used by the cost-based optimizer to skip files and reorder joins.

ANALYZE TABLE events COMPUTE STATISTICS FOR ALL COLUMNS
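File skipping with min/max statistics reduces to a range-overlap test per file. The sketch below assumes one integer column and per-file statistics held in a hypothetical FileStats record; the file names and value ranges are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class FileStats:
    path: str
    min_value: int
    max_value: int


def files_to_scan(stats, lower, upper):
    """Keep only files whose [min, max] range overlaps [lower, upper]."""
    return [s.path for s in stats if s.max_value >= lower and s.min_value <= upper]


stats = [
    FileStats("part-0001.parquet", 0, 99),
    FileStats("part-0002.parquet", 100, 199),
    FileStats("part-0003.parquet", 200, 299),
    FileStats("part-0004.parquet", 150, 250),
]
# Equivalent of a filter such as: WHERE col BETWEEN 120 AND 160
selected = files_to_scan(stats, 120, 160)
```

Two of the four files are never opened for this filter; the tighter each file's range, the more files a query can skip.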

UniForm: Iceberg interoperability

Write once as Delta, expose as Iceberg with no data duplication

How it works

Enable UniForm and DeltaForge generates Iceberg metadata (metadata.json, manifest list, manifests) alongside the Delta transaction log on every commit. The same physical Parquet files are referenced by both metadata layers.

What you get

Delta readers see the Delta log. Iceberg-compatible engines see the Iceberg metadata. No ETL pipeline, no second copy of data, no synchronization lag.

ALTER TABLE events SET TBLPROPERTIES ('delta.universalFormat.enabledFormats' = 'iceberg')


Delta protocol coverage with native performance

ACID transactions, deletion vectors, change data feed, time travel, and Iceberg interoperability on open Delta Lake tables.