Does the DeltaForge SQL engine support window functions?

Yes. All standard window functions are supported: ROW_NUMBER, RANK, DENSE_RANK, NTILE, PERCENT_RANK, CUME_DIST, FIRST_VALUE, LAST_VALUE, NTH_VALUE, LAG and LEAD. Frames are supported in all three modes (ROWS, RANGE, GROUPS) with frame exclusion clauses.

Can I write PL/pgSQL stored procedures against Delta Lake tables?

Yes. DeltaForge implements a PL/pgSQL-style procedural language with scalar variables, RECORD types, %TYPE and %ROWTYPE references, arrays, CONSTANT declarations, IF and CASE branches, LOOP, WHILE, and FOR loops, EXCEPTION blocks, RAISE NOTICE/WARNING/EXCEPTION, and cursors.

Does the SQL engine support graph queries on Delta tables?

Yes. The engine accepts Cypher property-graph queries (MATCH, WHERE, RETURN, OPTIONAL MATCH and variable-length paths) over Delta Lake tables that are modelled as nodes and relationships. Cypher and SQL share the same planner.

How does the cost-based optimizer use Delta Lake statistics?

The optimizer reads column-level min, max, null-count and histogram statistics from the Delta transaction log and from ANALYZE TABLE output. It uses these to prune partitions, skip Parquet files via min/max bounds, reorder joins by estimated cardinality, fold constants, and eliminate dead predicates.

Delta Lake SQL Engine: Full DML, No Spark

Function library

Common analytical functions across numeric, string, temporal, and aggregate categories

Math and numeric

Arithmetic, exponential, logarithmic, trigonometric, and hyperbolic functions. Decimal arithmetic with exact precision. Rounding, bucketing, and number-theory helpers.

String

Substring, pattern matching, regex, case conversion, padding, concatenation, encoding, hashing (md5, sha256, sha512), and array-to-string operations.

Date and time

Timestamp arithmetic, date truncation, component extraction, timezone conversion, interval operations, and calendar functions (day-of-week, quarter, week-of-year).
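
As a rough illustration of these categories, the sketch below runs against the `orders` table used in the examples on this page. `date_trunc` and `INTERVAL` literals appear elsewhere on this page; the `EXTRACT` field names (`QUARTER`, `WEEK`) are assumed to follow PostgreSQL and may differ in DeltaForge.

```sql
SELECT
    order_date,
    -- Truncate to the first instant of the month
    date_trunc('month', order_date)  AS order_month,
    -- Calendar components (field names assumed from PostgreSQL)
    EXTRACT(QUARTER FROM order_date) AS order_quarter,
    EXTRACT(WEEK FROM order_date)    AS order_week,
    -- Interval arithmetic
    order_date + INTERVAL '30 days'  AS due_date
FROM orders;
```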

Aggregates

Standard aggregates plus statistical functions: variance, standard deviation, correlation, covariance, linear regression coefficients, percentiles, and mode.
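
A minimal sketch of the statistical aggregates, assuming DeltaForge uses the PostgreSQL function names (`STDDEV_SAMP`, `VAR_SAMP`, `PERCENTILE_CONT`, `MODE`) and the ordered-set `WITHIN GROUP` syntax. The page does not list the exact names, so treat them as assumptions.

```sql
SELECT
    customer_id,
    STDDEV_SAMP(amount)                               AS amount_stddev,
    VAR_SAMP(amount)                                  AS amount_variance,
    -- Median as the 50th percentile, interpolated between neighbours
    PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY amount) AS median_amount,
    -- Most frequent order amount per customer
    MODE() WITHIN GROUP (ORDER BY amount)             AS most_common_amount
FROM orders
GROUP BY customer_id;
```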

Window functions

Full frame specification including ROWS, RANGE, and GROUPS modes with exclusion clauses

Ranking

ROW_NUMBER, RANK, DENSE_RANK, NTILE, PERCENT_RANK, CUME_DIST. Partition and order clauses supported on all ranking functions.
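
To make the difference between the ranking functions concrete, here is a short sketch over the `orders` table used on this page. It ranks each customer's orders from largest to smallest amount; the output column names are illustrative only.

```sql
SELECT
    customer_id,
    order_date,
    amount,
    -- Unique sequence number; the order among tied amounts is arbitrary
    ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY amount DESC) AS row_num,
    -- Tied amounts share a rank and leave a gap after them (1, 1, 3)
    RANK()       OVER (PARTITION BY customer_id ORDER BY amount DESC) AS amount_rank,
    -- Tied amounts share a rank with no gap (1, 1, 2)
    DENSE_RANK() OVER (PARTITION BY customer_id ORDER BY amount DESC) AS amount_dense_rank,
    -- Split each customer's orders into four buckets of near-equal size
    NTILE(4)     OVER (PARTITION BY customer_id ORDER BY amount DESC) AS amount_quartile
FROM orders;
```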

Value access

FIRST_VALUE, LAST_VALUE, NTH_VALUE, LAG (previous row), and LEAD (next row) with default-value support for boundary rows.
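
The default-value behaviour at partition boundaries can be sketched as follows, again assuming the `orders` table from this page and a decimal `amount` column. The three-argument `LAG(value, offset, default)` form follows the SQL standard.

```sql
SELECT
    customer_id,
    order_date,
    amount,
    -- Previous order's amount; 0.00 for each customer's first order
    LAG(amount, 1, 0.00) OVER (PARTITION BY customer_id ORDER BY order_date) AS prev_amount,
    -- Next order's date; NULL for the last order because no default is given
    LEAD(order_date)     OVER (PARTITION BY customer_id ORDER BY order_date) AS next_order_date,
    -- Amount of the customer's earliest order
    FIRST_VALUE(amount)  OVER (PARTITION BY customer_id ORDER BY order_date) AS first_amount
FROM orders;
```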

Frame modes

ROWS BETWEEN (physical row offsets), RANGE BETWEEN (logical value ranges, including interval-based ranges for time series), and GROUPS BETWEEN (peer group frames).
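
GROUPS is the least familiar of the three modes, so here is a brief sketch. It counts peer groups (rows that share the same `order_date`) rather than individual rows; the table and columns are the ones used in the examples on this page.

```sql
SELECT
    customer_id,
    order_date,
    amount,
    -- Frame = all orders on the current order_date plus all orders on the
    -- two preceding distinct order dates for this customer
    SUM(amount) OVER (
        PARTITION BY customer_id
        ORDER BY order_date
        GROUPS BETWEEN 2 PRECEDING AND CURRENT ROW
    ) AS last_3_order_dates_total
FROM orders;
```

With ROWS in place of GROUPS, the same frame would cover exactly three rows regardless of how many orders share a date.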

Frame exclusion

EXCLUDE NO OTHERS, EXCLUDE CURRENT ROW, EXCLUDE GROUP, and EXCLUDE TIES for precise control over which rows count in each frame.
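
As one possible use of an exclusion clause, the sketch below compares each order with its immediate neighbours while leaving the order itself out of the average. It assumes the `orders` table from this page.

```sql
SELECT
    customer_id,
    order_date,
    amount,
    -- Average of the previous and next order only; the current row is
    -- removed from the frame, so the result is NULL when there are no neighbours
    AVG(amount) OVER (
        PARTITION BY customer_id
        ORDER BY order_date
        ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING
        EXCLUDE CURRENT ROW
    ) AS neighbour_avg
FROM orders;
```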

Window query example

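SELECT
    customer_id,
    order_date,
    amount,
    -- Running total: every row from the start of the partition to the current row
    SUM(amount) OVER (
        PARTITION BY customer_id
        ORDER BY order_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS running_total,
    -- Trailing average over a logical 7-day range ending at the current row;
    -- if order_date is a DATE, the frame spans the current day plus the 7 before it
    AVG(amount) OVER (
        PARTITION BY customer_id
        ORDER BY order_date
        RANGE BETWEEN INTERVAL '7 days' PRECEDING AND CURRENT ROW
    ) AS moving_avg_7d
FROM orders;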

PL/pgSQL-style stored procedures

Procedural logic runs inside the engine, alongside your SQL, with full transaction semantics

Variables and types

Scalar variables, record types, %TYPE column references, %ROWTYPE table row types, array variables, and CONSTANT declarations with default values.
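
A small sketch of these declarations, assuming the `customers` table (columns `id`, `tier`, `status`) from the billing example on this page. The function name `customer_tier_label` and the `'standard'` fallback are hypothetical.

```sql
CREATE OR REPLACE FUNCTION customer_tier_label(p_customer_id INT)
RETURNS TEXT AS $$
DECLARE
    c_default_label CONSTANT TEXT := 'standard';  -- cannot be reassigned
    v_tier customers.tier%TYPE;                   -- follows the column's type
    v_row  customers%ROWTYPE;                     -- holds one whole customers row
BEGIN
    -- If no row matches, every field of v_row is NULL
    SELECT * INTO v_row FROM customers WHERE id = p_customer_id;
    v_tier := v_row.tier;
    RETURN COALESCE(v_tier, c_default_label);
END;
$$ LANGUAGE plpgsql;
```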

Control flow

IF/THEN/ELSIF/ELSE, CASE WHEN, LOOP/EXIT/CONTINUE, WHILE, FOR over integer ranges and query result sets, and FOREACH over arrays.
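
FOREACH and CONTINUE can be sketched together in a few lines. The function `sum_positive` below is a hypothetical helper written in standard PL/pgSQL, not part of DeltaForge.

```sql
CREATE OR REPLACE FUNCTION sum_positive(p_values INT[])
RETURNS INT AS $$
DECLARE
    v_value INT;
    v_sum   INT := 0;
BEGIN
    -- FOREACH cannot iterate over a NULL array, so return early
    IF p_values IS NULL THEN
        RETURN 0;
    END IF;

    FOREACH v_value IN ARRAY p_values LOOP
        -- Skip NULL and non-positive entries
        CONTINUE WHEN v_value IS NULL OR v_value <= 0;
        v_sum := v_sum + v_value;
    END LOOP;

    RETURN v_sum;
END;
$$ LANGUAGE plpgsql;
```

For example, `SELECT sum_positive(ARRAY[3, -1, 4]);` returns 7.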

Exception handling

EXCEPTION WHEN blocks with predefined exception types, RAISE EXCEPTION/NOTICE/WARNING, SQLSTATE error codes, and SQLERRM error messages.
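
A minimal sketch of trapping a specific error, assuming the predefined condition names match PostgreSQL's (here `division_by_zero`). The function `safe_ratio` is hypothetical.

```sql
CREATE OR REPLACE FUNCTION safe_ratio(p_num DECIMAL, p_den DECIMAL)
RETURNS DECIMAL AS $$
BEGIN
    RETURN p_num / p_den;
EXCEPTION
    -- Trap only the expected error; anything else still propagates
    WHEN division_by_zero THEN
        RAISE WARNING 'Division failed (SQLSTATE %): %', SQLSTATE, SQLERRM;
        RETURN NULL;
END;
$$ LANGUAGE plpgsql;
```

Catching a named condition rather than `WHEN OTHERS` keeps unexpected failures visible to the caller.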

Cursors

Implicit cursor FOR loops, explicit cursor declaration, parameterized cursors, OPEN/FETCH/CLOSE lifecycle, and REFCURSOR return type for set-returning functions.
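
The explicit cursor lifecycle can be sketched as follows, assuming the `customers` table from the billing example on this page. The function name `count_active_by_tier` is hypothetical, and the loop is deliberately verbose to show each step.

```sql
CREATE OR REPLACE FUNCTION count_active_by_tier(p_tier TEXT)
RETURNS INT AS $$
DECLARE
    -- Parameterized cursor: c_tier is bound when the cursor is opened
    cur_customers CURSOR (c_tier TEXT) FOR
        SELECT id FROM customers WHERE status = 'active' AND tier = c_tier;
    v_id    customers.id%TYPE;
    v_count INT := 0;
BEGIN
    OPEN cur_customers(p_tier);
    LOOP
        FETCH cur_customers INTO v_id;
        EXIT WHEN NOT FOUND;   -- no more rows
        v_count := v_count + 1;
    END LOOP;
    CLOSE cur_customers;
    RETURN v_count;
END;
$$ LANGUAGE plpgsql;
```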

PL/pgSQL stored procedure example

CREATE OR REPLACE FUNCTION process_monthly_billing(
    p_month DATE,
    p_discount_threshold DECIMAL DEFAULT 1000.00
) RETURNS TABLE(
    customer_id INT,
    total_amount DECIMAL,
    discount_applied DECIMAL,
    final_amount DECIMAL
) AS $$
DECLARE
    v_customer RECORD;
    v_total DECIMAL;
    v_discount DECIMAL;
    v_processed INT := 0;
BEGIN
    FOR v_customer IN
        SELECT id, tier FROM customers WHERE status = 'active'
    LOOP
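        -- Alias orders so its columns are not confused with the function's
        -- output columns (customer_id is also declared in RETURNS TABLE)
        SELECT COALESCE(SUM(o.amount), 0) INTO v_total
        FROM orders o
        WHERE o.customer_id = v_customer.id
          AND date_trunc('month', o.order_date) = date_trunc('month', p_month);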

        v_discount := CASE
            WHEN v_customer.tier = 'platinum' AND v_total > p_discount_threshold THEN v_total * 0.15
            WHEN v_customer.tier = 'gold'     AND v_total > p_discount_threshold THEN v_total * 0.10
            WHEN v_total > p_discount_threshold * 2 THEN v_total * 0.05
            ELSE 0
        END;

        customer_id      := v_customer.id;
        total_amount     := v_total;
        discount_applied := v_discount;
        final_amount     := v_total - v_discount;
        RETURN NEXT;
        v_processed := v_processed + 1;
    END LOOP;

    RAISE NOTICE 'Processed % customers for month %', v_processed, p_month;
EXCEPTION
    WHEN OTHERS THEN
        RAISE EXCEPTION 'Billing error: % - %', SQLSTATE, SQLERRM;
END;
$$ LANGUAGE plpgsql;

Frequently asked questions

Common questions about running SQL on Delta Lake without Spark

Can I query Delta Lake without Databricks?

Yes. DeltaForge is a standalone SQL engine that reads and writes Delta Lake tables directly in object storage or on local disk. It needs no Databricks workspace, no Spark cluster, and no JVM. BI tools connect through the native ODBC driver.

Can I run MERGE on Delta Lake without Spark?

Yes. MERGE INTO, UPDATE, DELETE and INSERT are native SQL statements executed against the Delta transaction log with ACID guarantees. Every release is round-trip verified against Spark in the public conformance suite: DeltaForge writes, Spark reads, and vice versa.
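
As an illustration, an upsert into the `customers` table from the billing example might look like the sketch below. The staging table `customer_updates` is hypothetical, and the statement follows the standard MERGE form that Delta Lake also uses.

```sql
MERGE INTO customers AS t
USING customer_updates AS s
    ON t.id = s.id
-- Existing customer: refresh tier and status
WHEN MATCHED THEN
    UPDATE SET tier = s.tier, status = s.status
-- New customer: insert the row
WHEN NOT MATCHED THEN
    INSERT (id, tier, status) VALUES (s.id, s.tier, s.status);
```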

Which SQL dialect does the engine target?

A PostgreSQL-leaning dialect with Delta Lake extensions: VERSION AS OF, TIMESTAMP AS OF, MERGE INTO, OPTIMIZE, ZORDER BY, and VACUUM. The same engine also reads and writes Iceberg; see the table format page.
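
The statements below sketch how these extensions are conventionally written in Delta Lake SQL. The exact clause forms accepted by DeltaForge are an assumption here, and the version number, timestamp, and retention period are placeholder values.

```sql
-- Time travel by table version
SELECT * FROM orders VERSION AS OF 12;

-- Time travel by timestamp
SELECT * FROM orders TIMESTAMP AS OF '2024-01-01 00:00:00';

-- Compact small files and cluster data by a frequently filtered column
OPTIMIZE orders ZORDER BY (customer_id);

-- Remove data files no longer referenced by the retained table history
VACUUM orders RETAIN 168 HOURS;
```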

The Delta Lake SQL engine, no Spark required