DeltaForge writes a Delta Lake or Iceberg UniForm table. Apache Spark 4.0 reads it back and verifies row count, content hash, and schema hash. Both engines must agree.
7,744 scripts across Delta Lake and Iceberg UniForm writes. Pass/fail counts appear after the first run.
INSERT (167 scripts), UPDATE (177), DELETE (167), MERGE (317). Every type, every null pattern, with and without deletion vectors and CDF.
ADD/DROP/RENAME/REORDER column (64 scripts), type widening (61), column mapping (57), generated columns (60), default values (82), CHECK constraints (48), nested struct evolution (43).
Partitioning (117 scripts), Z-Order (110), statistics correctness (104), predicate pushdown (56). Layout choices that the reader must honour.
Change Data Feed (100 scripts), deletion vectors (76), identity columns (93), in-commit timestamps (77), row tracking (56). Time travel and RESTORE (173). VACUUM (71).
A test passes only when all three independent verifications (row count, content hash, and schema hash) agree.
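The three-way check can be sketched in plain Python. This is a minimal illustration, not DeltaForge's implementation: the source does not show how `content_hash` and `schema_hash` are computed, so the SHA-256 scheme, the row and field representations, and the `verify` helper are all assumptions made for the example.

```python
import hashlib
import json


def content_hash(rows):
    """Order-independent hash of a table's rows (hypothetical scheme).

    Each row is serialised to JSON and hashed on its own; the sorted row
    digests are then hashed together, so two engines that return the same
    rows in a different order produce the same value. Duplicate rows are
    kept, so a lost or doubled row changes the result.
    """
    row_digests = sorted(
        hashlib.sha256(json.dumps(row, default=str).encode("utf-8")).hexdigest()
        for row in rows
    )
    return hashlib.sha256("".join(row_digests).encode("utf-8")).hexdigest()


def schema_hash(fields):
    """Hash of (name, type, nullable) triples; column order is significant."""
    canonical = json.dumps([list(f) for f in fields])
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify(rows, fields, expected_count, expected_hash, expected_schema_hash):
    """Run the three checks; the overall result is True only if all pass."""
    checks = {
        "row_count": len(rows) == expected_count,
        "content_hash": content_hash(rows) == expected_hash,
        "schema_hash": schema_hash(fields) == expected_schema_hash,
    }
    return all(checks.values()), checks
```

Hashing each row separately and sorting the digests keeps the comparison independent of the order in which the reader returns rows, which matters because neither engine guarantees row order without an explicit sort.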
On failure, the record includes the exact expected and actual row counts, the hash mismatch with a sample of diverging rows, a field-level schema diff, and the full reader exception.
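As a rough sketch of what such a failure record might contain, the snippet below assembles the same four pieces of information. The function names, the dictionary layout, and the assumption that the harness has the expected rows and schema at hand are all hypothetical; the source does not describe the actual record format.

```python
from collections import Counter


def schema_diff(expected, actual):
    """Field-level diff between two {name: type} mappings."""
    diff = []
    for name in sorted(set(expected) | set(actual)):
        if name not in actual:
            diff.append(f"missing field {name}: expected {expected[name]}")
        elif name not in expected:
            diff.append(f"unexpected field {name}: got {actual[name]}")
        elif expected[name] != actual[name]:
            diff.append(
                f"type mismatch on {name}: expected {expected[name]}, got {actual[name]}"
            )
    return diff


def failure_record(expected_rows, actual_rows, expected_schema, actual_schema,
                   reader_exception=None, sample_size=5):
    """Collect row counts, diverging rows, schema diff and exception."""
    expected_counts = Counter(map(tuple, expected_rows))
    actual_counts = Counter(map(tuple, actual_rows))
    # Multiset differences keep duplicate rows that only one side has.
    missing = list((expected_counts - actual_counts).elements())
    unexpected = list((actual_counts - expected_counts).elements())
    return {
        "row_count": {"expected": len(expected_rows), "actual": len(actual_rows)},
        "sample_missing_rows": missing[:sample_size],
        "sample_unexpected_rows": unexpected[:sample_size],
        "schema_diff": schema_diff(expected_schema, actual_schema),
        "reader_exception": repr(reader_exception) if reader_exception else None,
    }
```

Comparing rows as multisets rather than sets means a table that has the right row count but one duplicated and one missing row still reports both diverging rows.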
# df-sql/01_basic_data_files.sql
CREATE DELTA TABLE basic_data_files (
    id BIGINT, order_number STRING, ...
) LOCATION '${TABLE_PATH}'
TBLPROPERTIES ('delta.enableDeletionVectors' = 'true');

INSERT INTO basic_data_files
WITH row_data AS (...)
SELECT ... FROM row_data;
# spark-reads-df/verify_01_basic_data_files.py
df = spark.read.format("delta").load(table_path)
assert df.count() == 372
assert content_hash(df) == expected_hash
assert schema_hash(df.schema) == expected_schema_hash
Every script either passes all three checks or shows up as a failure on the conformance dashboard. No other outcomes.