Connectors

Connect to Any Data Source

Read from common file formats, relational databases (PostgreSQL, MySQL, SQL Server, Oracle), and cloud services. Write to Delta Lake tables with full schema evolution support.

Predicate pushdown to source
Automatic schema inference
Parallel ingestion at scale
Visual Flattener for nested formats
[Architecture diagram] The Delta Forge Unified Query Engine reads from Amazon S3 (s3://bucket/, Parquet, CSV), Azure Blob / ADLS Gen2 (Delta Tables), PostgreSQL, SQL Server, Apache Kafka (streaming), Google Cloud (GCS + BigQuery), and file formats including Parquet, CSV, JSON, and 15+ more, producing unified, versioned, ACID-compliant Delta Lake output.

Visual Flattener

Turn any nested format into a SQL table — visually configure, automatically flatten

Delta Forge includes a visual schema discovery and configuration tool that transforms complex, nested data formats into flat, queryable SQL tables. One unified experience across six formats: JSON, XML, EDI, HL7, FHIR, and Protobuf.

How It Works

1. Discover

Scan files and automatically discover all nested paths, types, and sample values

2. Configure

Use an interactive tree view to select which fields to include, exclude, explode, or keep as JSON

3. Query

Flattened data appears as a standard SQL table. Missing paths become NULL. Configuration persists across queries.

Five Selection Modes Per Field

INCLUDE

Whitelist specific paths into the output

EXCLUDE

Remove entirely from output

EXPLODE

Create one row per array element (like SQL UNNEST)

JSON

Keep subtree as a JSON string column instead of flattening

Default

Automatic flattening behavior — all paths included with standard naming
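
EXPLODE is the one mode that changes row counts, so a concrete sketch helps (the table and column names here are illustrative, not from the product):

```sql
-- Input row: { "order_id": 1, "items": ["widget", "gadget"] }
-- With the 'items' array set to EXPLODE, the flattened table
-- yields one output row per array element, like SQL UNNEST:

SELECT order_id, items
FROM orders;

-- order_id | items
-- 1        | widget
-- 1        | gadget
```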

Six Formats, One Visual Experience

JSON

JSONPath, SIMD-accelerated parsing

XML

XPath-like expressions, attribute handling, namespace support

EDI

Segment-based flattening, composite elements

HL7

Component flattening, friendly aliases

FHIR

Resource type discovery, bundle unbundling

Protobuf

Enum decoding, repeated field handling

Schema Evolution Built In

Automatic Structure Merging

  • Files with different structures handled automatically
  • Path aliases map multiple source paths to one output column
  • Missing paths become NULL — consistent schema across all files
-- Input (nested JSON):
-- {
--   "user": { "name": "Alice", "contact": { "email": "alice@example.com" } },
--   "tags": ["vip", "active"],
--   "metadata": { "source": "api", "raw": {...} }
-- }

-- Output (flattened SQL table):
-- user_name | user_contact_email  | tags             | metadata
-- Alice     | alice@example.com   | ["vip","active"] | (kept as JSON)

SELECT user_name, user_contact_email, tags, metadata
FROM customers;  -- flattened table, ready to query
One visual experience

All 6 formats share the same tree view — no format-specific tooling needed

Persistent configuration

Configuration persists to the table — query results are always consistent

SIMD-accelerated

500MB/s+ throughput for JSON processing

No code required

Point, click, query — flatten nested data without writing any transformation code

Database Connectors

Connect to relational and NoSQL databases with predicate pushdown. All connection credentials are stored securely in OS Keychain or Azure Key Vault, never in config files.

PostgreSQL

Full-featured connectivity with SSL, connection pooling, and predicate pushdown

MySQL / MariaDB

MySQL 5.7+ and MariaDB support with binary protocol

SQL Server

Microsoft SQL Server with Windows and Azure AD authentication

Oracle Database

Oracle 12c+ with TNS and Easy Connect naming

MongoDB

Document database with aggregation pipeline pushdown

Redis

Key-value store with cluster and sentinel support

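
What predicate pushdown means in practice: the WHERE clause is evaluated inside the source database, so only matching rows cross the network. A minimal sketch; the `postgres_scan` table function name is an assumption for illustration, not confirmed syntax:

```sql
-- Hypothetical function name, shown only to illustrate pushdown.
SELECT customer_id, order_total
FROM postgres_scan('host=db.example.com dbname=sales', 'public', 'orders')
WHERE order_date >= DATE '2024-01-01';  -- filter runs on the PostgreSQL server
```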
Cloud Object Storage

Native integration with all major cloud providers

Amazon Web Services

  • Amazon S3 (all storage classes)
  • S3 Express One Zone
  • AWS Glue Catalog integration
  • IAM roles & STS credentials
  • Cross-account access

Microsoft Azure

  • Azure Blob Storage
  • Data Lake Storage Gen2
  • Azure Active Directory auth
  • Managed identity support
  • SAS token authentication

Google Cloud Platform

  • Google Cloud Storage
  • BigQuery external tables
  • Service account auth
  • Workload identity federation
  • Multi-regional buckets
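
Object-store data is typically queried directly by URI. A hedged sketch (the bucket name and glob path are illustrative):

```sql
-- Read Parquet files straight from S3; credentials come from the
-- configured IAM role or STS session, not from the query text.
SELECT event_type, count(*) AS events
FROM 's3://my-bucket/events/year=2024/*.parquet'
GROUP BY event_type;
```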

File Format Support

Native support for all major data formats with optimized readers

Columnar Formats

  • Parquet: Column pruning, predicate pushdown
  • ORC: Hive-compatible, ACID support
  • Arrow IPC: Zero-copy reads
  • Avro: Schema evolution

Text, Semi-Structured & Binary

  • CSV: Auto-dialect detection
  • JSON: NDJSON, subtree capture
  • XML: XPath, subtree capture
  • Excel: Multi-sheet, XLSX/XLS/ODS
  • Protobuf: Proto3 binary parsing

Protocol Buffers

Query Proto3 binary data with SQL — a capability most engines simply don't have

Schema-Driven Parsing

  • Read Proto3 binary files directly with a .proto descriptor
  • Specify the message type to decode from the schema
  • Glob patterns for multi-file ingestion
  • Streaming reads for large binary datasets

Nested Messages & Repeated Fields

  • Nested messages flattened into dot-notation columns
  • Repeated fields mapped to Arrow list arrays
  • Map fields decoded as key-value struct arrays
  • Oneof fields with automatic null-filling

Enum Decoding

  • Enum values decoded to human-readable string names
  • Unknown enum values preserved as integer fallbacks
  • Optional raw integer mode for performance

Well-Known Types

  • google.protobuf.Timestamp → Arrow TIMESTAMP
  • google.protobuf.Duration → Arrow INTERVAL
  • google.protobuf.StringValue & wrapper types
  • google.protobuf.Struct as JSON columns
-- Read IoT sensor data from Proto3 binary files
SELECT device_id, temperature, humidity, recorded_at
FROM read_protobuf(
    'sensors/*.pb',
    'sensor.proto',
    'SensorReading'
)
WHERE temperature > 35.0
ORDER BY recorded_at DESC;

Apache ORC

Production-grade ORC reading for Hive data warehouses — battle-tested across 6 industry demos

Hive-Compatible Reading

  • Read ORC files from Hive-managed and external tables
  • Full ACID transaction support (insert, update, delete)
  • Partition pruning with Hive-style directory layouts
  • Proven across banking, clinical trials, energy, insurance, server logs, and warehouse demos

Complex Types

  • STRUCT fields mapped to nested Arrow structs
  • MAP fields as key-value list arrays
  • ARRAY fields as Arrow list columns
  • Deeply nested combinations of all three

Stripe-Level Statistics

  • Min/max statistics per stripe for predicate pushdown
  • Bloom filters for high-cardinality column filtering
  • Row-group-level skipping for large files
  • Column-level statistics for query optimization

Compression Codecs

  • ZLIB — maximum compression ratio
  • Snappy — balanced speed and size
  • LZ4 — fastest decompression
  • ZSTD — best overall compression
  • Automatic codec detection per file
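
A sketch of querying partitioned ORC data; the `read_orc` function name follows the pattern of `read_json` and `read_protobuf` used elsewhere on this page and is an assumption:

```sql
-- Partition pruning skips year directories outside the filter;
-- stripe min/max statistics skip stripes that cannot match the predicate.
SELECT account_id, balance
FROM read_orc('warehouse/accounts/year=2024/*.orc')
WHERE balance > 100000;
```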

Apache Avro

Schema evolution across files with automatic type promotion and null-filling

Schema Evolution

  • Read files written with different schema versions together
  • New columns in newer files automatically NULL-filled for older rows
  • Removed columns gracefully excluded from the merged schema
  • Type promotion: int → long, float → double

Logical Types

  • date → Arrow DATE32
  • timestamp-millis / timestamp-micros → Arrow TIMESTAMP
  • decimal with precision and scale preserved
  • uuid, time-millis, time-micros

Compression Codec Mixing

  • Each Avro file can use a different codec
  • Snappy, Deflate, ZSTD, Bzip2 detected per-file
  • Transparent decompression during query execution
  • No configuration needed — codecs detected automatically

Nested Records

  • Avro records mapped to Arrow struct columns
  • Arrays mapped to Arrow list columns
  • Maps mapped to key-value struct arrays
  • Unions decoded with type-tag discrimination
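
Schema evolution in practice means files written under different schema versions are readable in one query. A sketch assuming a `read_avro` table function (the name mirrors `read_json`/`read_protobuf` and is not confirmed here); `loyalty_tier` stands in for a column added in a newer schema version:

```sql
-- Rows from files predating 'loyalty_tier' come back NULL for it;
-- an int column widened to long in newer files reads as BIGINT throughout.
SELECT event_id, user_id, loyalty_tier
FROM read_avro('events/**/*.avro');
```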

JSON & NDJSON

Flexible JSON reading with subtree capture for semi-structured analytics

Subtree Capture with json_paths

  • Preserve nested objects as queryable JSON blob columns
  • Extract flat fields while keeping complex structures intact
  • Ideal for semi-structured data with variable nesting
  • JSON blob columns queryable with json_extract functions

Format Variants

  • NDJSON (newline-delimited) for streaming workloads
  • JSON arrays for bulk exports
  • Mixed-type arrays with automatic type widening
  • Deeply nested objects with configurable flatten depth
-- Keep nested 'address' as a JSON blob, extract flat fields normally
SELECT name, email, address
FROM read_json('customers/*.json',
    json_paths := '{address}'
);

-- Result: 'address' column contains full JSON objects
-- {"street": "123 Main St", "city": "Denver", "state": "CO", "zip": "80202"}

-- Then query into the captured subtree
SELECT name, json_extract(address, '$.city') AS city
FROM read_json('customers/*.json',
    json_paths := '{address}'
);

XML

Structured XML reading with subtree capture and schema evolution

Subtree Capture

  • Preserve nested XML elements as string columns
  • Extract parent-level attributes while keeping child trees intact
  • XPath-based element selection for targeted reading
  • Mixed content handling with text and element children

Schema Evolution

  • Merge schemas across XML files with different structures
  • New elements in newer files NULL-filled for older rows
  • Attribute and element unification in the output schema
  • Namespace-aware parsing for enterprise XML formats

RSS & Feed Parsing

  • RSS 2.0 and Atom feed ingestion as relational tables
  • Channel metadata extracted alongside item rows
  • Enclosure and media elements captured
  • Date normalization across feed date formats
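
Subtree capture for XML mirrors the JSON example shown earlier. A sketch assuming a `read_xml` table function (name not confirmed by this page):

```sql
-- Keep the nested <shipping> element as a string column while
-- flat fields and attributes are extracted normally.
SELECT order_id, status, shipping
FROM read_xml('orders/*.xml');
```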

Excel Workbooks

Multi-sheet reading with intelligent header detection and per-sheet type inference

Multi-Sheet Reading

  • Read specific sheets by name or index
  • Read all sheets at once into separate tables
  • Sheet name available as a metadata column
  • Support for XLSX, XLS (legacy), and ODS formats

Header Row Detection

  • Automatic header row identification
  • Skip leading blank rows and title rows
  • Configurable header row offset for non-standard layouts
  • Column name sanitization and deduplication

Type Inference Per Sheet

  • Independent type inference for each sheet
  • Excel date serial numbers decoded to proper dates
  • Currency and percentage formatting preserved
  • Formula cells read as computed values
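
A sketch of sheet selection; `read_excel` and its `sheet` parameter are assumptions, styled after the `json_paths :=` named-argument syntax shown earlier:

```sql
-- Read one sheet by name; the header row is detected automatically
-- and Excel date serial numbers are decoded to proper DATE values.
SELECT region, q1_sales, q2_sales
FROM read_excel('reports/sales.xlsx', sheet := 'FY2024');
```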

Streaming Connectors

Real-time data ingestion from event streams

Apache Kafka

High-throughput distributed event streaming with consumer groups

Amazon Kinesis

AWS managed streaming with automatic scaling

Azure Event Hubs

Azure-native event ingestion at scale

Google Pub/Sub

GCP messaging with exactly-once delivery
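
Streaming sources typically surface as tables as well. Purely illustrative; the `read_kafka` function, its arguments, and the metadata column names are all assumptions:

```sql
-- Consume a topic as a table; broker address and topic are placeholders.
SELECT payload, kafka_partition, kafka_offset
FROM read_kafka('broker-1:9092', 'sensor-events');
```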

Intelligent Schema Inference

Automatic type detection across 40+ locales with auto-generated transform views — no manual schema definitions

Culture-Aware Parsing

  • German dates: DD.MM.YYYY, US dates: MM/DD/YYYY
  • French decimals: 1 234 567,89
  • German grouping: 1.234.567,89
  • Spanish month names: Enero, Febrero, Marzo...
  • AM/PM designators and negative number formats across locales

12 Detected Types

  • Boolean, SmallInt, Int, BigInt, Decimal, Float
  • Date, Time, DateTime, UUID, Varchar
  • Configurable confidence thresholds (default 80%)
  • Automatic fallback to VARCHAR when confidence is low
  • SQL cast expression generation for each column
  • Auto-generated transform views from inferred types

Schema Merging & Evolution

  • Three modes: Merge (union), Strict, Intersection
  • Type widening: int → bigint, float → double
  • Null-filling for columns missing in older files
  • Column order preservation from first schema
  • Force-nullable mode for safe dynamic evolution

Parallel Processing

  • Rayon-based parallel inference across all CPU cores
  • Configurable sample sizes (1K fast to 100K+ thorough)
  • Compiled regex patterns cached for zero re-compilation
  • Schema fingerprinting for O(1) change detection
  • Automatic catalog sync without manual "Scan Files"
-- Same column, different locales — Delta Forge infers correctly

-- German (de-DE): period groups, comma decimal
order_total:  1.234.567,89  →  DECIMAL
order_date:   15.03.2024    →  DATE

-- US English (en-US): comma groups, period decimal
order_total:  1,234,567.89  →  DECIMAL
order_date:   03/15/2024    →  DATE

-- French (fr-FR): space groups, comma decimal
order_total:  1 234 567,89  →  DECIMAL

-- Auto-generated transform view based on inference
CREATE VIEW v_orders AS
SELECT
    CAST(order_total AS DECIMAL(12,2)) AS order_total,
    CAST(order_date AS DATE) AS order_date,
    customer_name
FROM raw_orders;

Connect all your data sources

Unify your data from any source into Delta Lake tables.