Delta Lake operations
- MERGE, UPDATE and DELETE on Delta Lake Without Spark: full SQL DML on Delta tables in object storage, with the engine landscape laid out honestly.
- SCD Type 2 on Delta Lake in Pure SQL: a copy-paste MERGE tutorial for slowly changing dimensions, including expire-and-insert in one atomic statement.
- Delta Lake Change Data Feed in SQL: enable CDF, query table_changes(), and build an incremental pipeline end to end.
- OPTIMIZE, VACUUM and Z-ORDER Without Spark: the Delta maintenance runbook, including ordering, cadence, and partitioning guidance.
- Cannot Time Travel Delta Table to Version X: why the error happens, the retention math behind it, and the RESTORE recovery paths.
- GDPR Right to Be Forgotten on Delta Lake, in SQL: what DELETE really removes, how deletion vectors and VACUUM interact with erasure, and where pseudonymisation fits.
Apache Iceberg
- Iceberg INSERT, UPDATE, DELETE and MERGE in SQL, No Spark Required: a complete worked DML lifecycle on Iceberg tables, verified with time travel.
File formats
- CSV to Delta Lake Without Spark: Two SQL Statements: create an external table with automatic schema discovery, then load it with CTAS (CREATE TABLE AS SELECT).
- Query Excel Files with SQL, No OPENROWSET, No Notebook: .xlsx workbooks in object storage exposed as queryable tables.
- Yes, You Can Query EDI Files with SQL: X12, EDIFACT and TRADACOMS documents queried in place, with no parser pipeline to build first.
Healthcare data
- HL7 to SQL: Query HL7v2 Messages Without an Interface Engine: ADT and ORU messages as patient and lab tables.
- FHIR Analytics Without Spark: a lighter stack for FHIR bundles and bulk-export NDJSON, joined in SQL.
Performance and scale
- DeltaForge at Scale: Table Size, Delta Log Depth, and Performance: what changes as Delta tables grow to hundreds of millions of rows, and which maintenance operations keep planning overhead predictable.
Graph analytics
- Run Cypher on Parquet and Delta Tables Without Neo4j: openCypher and SQL side by side on the same lakehouse tables.
- Community Detection in SQL: Louvain on Delta Lake Tables: graph clustering on warehouse tables, validated against ground-truth communities.