A hybrid RDF reasoner for OWL 2 RL,
built for modern memory architectures.

Forward and backward chaining over OWL 2 RL semantics, with a SPARQL 1.1 frontend. Apache-2.0, EU-developed, open from the start.

Stage 1: feasibility prototype running. 78 of 91 selected W3C OWL 2 RL cases are green, and the worst-case-optimal join (WCOJ) engine clears its acceptance gate.

What it is, and why we're building it

HornDB is a triple store you reason over: load RDF, run SPARQL, get answers — with provenance back to the premises that produced them. The name is literal. Horn as in Horn clauses; OWL 2 RL is precisely the fragment of OWL 2 whose entailment rules are Horn rules evaluated bottom-up to a fixpoint, which is what this engine does.
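To make "Horn rules evaluated bottom-up to a fixpoint" concrete, here is a minimal sketch in Rust. It is not HornDB's code: it interprets two OWL 2 RL rules (scm-sco and cax-sco) with nested loops at runtime, and the names `Triple`, `Facts`, `derive` and `closure`, along with the `ex:` data, are assumptions made for illustration. Each round joins only against triples that were new in the previous round, and every derived triple keeps the two premises of its first derivation as a very simple form of provenance.

```rust
use std::collections::HashMap;

/// (subject, predicate, object); plain strings keep the sketch readable.
type Triple = (&'static str, &'static str, &'static str);
/// Every known triple maps to `None` (asserted) or to the two premises
/// of the first derivation that produced it.
type Facts = HashMap<Triple, Option<(Triple, Triple)>>;

const SUB_CLASS_OF: &str = "rdfs:subClassOf";
const TYPE: &str = "rdf:type";

/// Apply two Horn rules, taking the first premise from `left` and the second from `right`:
///   scm-sco: (a subClassOf b), (b subClassOf c) -> (a subClassOf c)
///   cax-sco: (x type a),       (a subClassOf b) -> (x type b)
/// Both join the first premise's object with the second premise's subject.
fn derive(left: &[Triple], right: &[Triple]) -> Vec<(Triple, (Triple, Triple))> {
    let mut out = Vec::new();
    for &l in left {
        for &r in right {
            let joins = l.2 == r.0 && r.1 == SUB_CLASS_OF;
            if joins && (l.1 == SUB_CLASS_OF || l.1 == TYPE) {
                out.push(((l.0, l.1, r.2), (l, r)));
            }
        }
    }
    out
}

/// Bottom-up evaluation to a fixpoint. `delta` holds only the triples found in
/// the previous round, so each round requires at least one new premise.
fn closure(asserted: &[Triple]) -> Facts {
    let mut facts: Facts = asserted.iter().map(|&t| (t, None)).collect();
    let mut delta: Vec<Triple> = asserted.to_vec();
    while !delta.is_empty() {
        let all: Vec<Triple> = facts.keys().copied().collect();
        let mut candidates = derive(&delta, &all);
        candidates.extend(derive(&all, &delta));
        delta.clear();
        for (t, premises) in candidates {
            if !facts.contains_key(&t) {
                facts.insert(t, Some(premises));
                delta.push(t);
            }
        }
    }
    facts
}

fn main() {
    let facts = closure(&[
        ("ex:Cat", SUB_CLASS_OF, "ex:Mammal"),
        ("ex:Mammal", SUB_CLASS_OF, "ex:Animal"),
        ("ex:tom", TYPE, "ex:Cat"),
    ]);
    let goal = ("ex:tom", TYPE, "ex:Animal");
    println!("{:?} derived from {:?}", goal, facts[&goal]);
}
```

The loop stops when a round adds nothing new. Which pair of premises is recorded for a triple with several derivations depends on iteration order in this sketch.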

The reasoner space today forces a choice. Pure-materialization commercial engines are fast but lose a factor of 100–1000× on backward chaining and are not open. Open-source toolkits are flexible but slower on the same workload. HornDB makes a different set of bets: materialize the cheap, regular subset; backward-chain the rest; and treat modern memory hierarchies as a first-class target rather than an afterthought.

The bets

Six design decisions set HornDB apart. The full rationale lives in the vision spec.

  • Hybrid execution

    Not pure materialization. Materialize the schema and transitive-closure subset; backward-chain the rest with magic sets.

  • Modern memory as a target

    A tiered working set — fast memory for the hot data, DRAM for warm, CXL/NVMe for cold — designed in, not bolted on.

  • Incremental maintenance

    DBSP-style Z-set differences instead of DRed or counting, so updates touch only what actually changed.

  • GraphBLAS closure

    Schema-level transitive closure as a semiring matrix multiply on SuiteSparse:GraphBLAS.

  • Compiled rules

    Soufflé-style ahead-of-time compilation: OWL 2 RL rules become native Rust — no rule interpreter in the hot path.

  • Provenance as a requirement

    Every inferred triple traces back to its premises. Explainability is a hard constraint, not a feature flag.
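The incremental-maintenance bet can be illustrated with a small Z-set sketch. A Z-set attaches an integer weight to each row, so an insertion is a row with weight +1, and a join over changed inputs can be updated from the change alone. The code below is an assumption-laden illustration, not HornDB's or DBSP's API: `ZSet`, `Edge`, `apply`, `join` and `join_delta` are hypothetical names, and the "two-hop paths" view is invented for the example.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// A Z-set: each row carries an integer weight (+1 for an insertion).
type ZSet<T> = HashMap<T, i64>;
type Edge = (u32, u32);

/// Add `delta` into `base`, dropping rows whose weight reaches zero.
fn apply<T: Hash + Eq + Clone>(base: &mut ZSet<T>, delta: &ZSet<T>) {
    for (row, w) in delta {
        let total = {
            let entry = base.entry(row.clone()).or_insert(0);
            *entry += *w;
            *entry
        };
        if total == 0 {
            base.remove(row);
        }
    }
}

/// Join (x, y) with (y, z) into (x, z); weights multiply and then add up.
fn join(a: &ZSet<Edge>, b: &ZSet<Edge>) -> ZSet<Edge> {
    let mut out: ZSet<Edge> = HashMap::new();
    for (&(x, y), &wa) in a {
        for (&(y2, z), &wb) in b {
            if y == y2 {
                *out.entry((x, z)).or_insert(0) += wa * wb;
            }
        }
    }
    out.retain(|_, w| *w != 0);
    out
}

/// Change of the self-join E⋈E when E changes by d (E is the state before the change):
/// (E + d)⋈(E + d) - E⋈E = d⋈E + E⋈d + d⋈d.
fn join_delta(e: &ZSet<Edge>, d: &ZSet<Edge>) -> ZSet<Edge> {
    let mut out = join(d, e);
    apply(&mut out, &join(e, d));
    apply(&mut out, &join(d, d));
    out
}

fn main() {
    let mut edges: ZSet<Edge> = vec![((1, 2), 1), ((2, 3), 1)].into_iter().collect();
    let mut paths = join(&edges, &edges); // materialized two-hop paths

    // Insert one edge and update the view from the delta alone.
    let delta: ZSet<Edge> = vec![((3, 4), 1)].into_iter().collect();
    let change = join_delta(&edges, &delta);
    apply(&mut paths, &change);
    apply(&mut edges, &delta);

    // The incrementally maintained view matches a full recomputation.
    assert_eq!(paths, join(&edges, &edges));
    println!("two-hop paths: {:?}", paths);
}
```

Only the rows that join with the delta are touched. The same algebra would express a deletion as a row with weight -1, which the sketch accepts but `main` does not exercise.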

What runs today

Stage 1 is a working engine, not a paper design. Every subsystem below has code in the repository, exercised by tests and the conformance harness.

  • Tiered storage (shipping)

    Dictionary-encoded terms and predicate-partitioned columnar triples, with an N-Triples loader that already ingests RDF 1.2 triple terms.

  • Worst-case-optimal joins (shipping)

    Leapfrog Triejoin with cost-based fallback to a binary hash join. Clears its acceptance gate — ≥10× a binary-join baseline on the canonical four-cycle workload (~34× measured on a skewed 10⁶-edge graph).

  • Compiled OWL 2 RL rules (shipping)

    Entailment rules compiled ahead-of-time to native Rust — no interpreter in the hot path — evaluated semi-naïvely to a fixpoint. 78 of 91 selected W3C OWL 2 RL cases pass.

  • GraphBLAS closure (shipping)

    Schema-level transitive closure runs as a semiring matrix multiply on SuiteSparse:GraphBLAS, linked in natively.

  • Incremental maintenance (insertion-only)

    DBSP-style Z-set deltas update only what changed. Insertions ship today; retraction is Stage 2.

  • SPARQL 1.1 frontend (shipping)

    Parser, algebra, planner and an HTTP endpoint answering SELECT / ASK / CONSTRUCT — including GROUP BY and aggregates — driven end-to-end against the LDBC Semantic Publishing Benchmark.
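The core step of Leapfrog Triejoin is a multi-way intersection of sorted key lists, in which each list repeatedly seeks forward to the largest key seen so far instead of scanning. The sketch below shows only that unary leapfrog step over plain sorted slices. It is an illustration under stated assumptions, not HornDB's join engine: the function name `leapfrog_intersect` is hypothetical, the inputs are assumed to be sorted and free of duplicates, and the trie descent and cost-based fallback are left out.

```rust
/// Intersect sorted, duplicate-free slices by leapfrogging:
/// the list holding the smallest current key seeks to the largest one.
fn leapfrog_intersect(lists: &[&[u32]]) -> Vec<u32> {
    let mut out = Vec::new();
    if lists.is_empty() || lists.iter().any(|l| l.is_empty()) {
        return out;
    }
    let k = lists.len();
    let mut pos = vec![0usize; k]; // cursor into each list

    // Visit the lists in order of their first key, so the list at `p`
    // always holds the smallest current key and its predecessor the largest.
    let mut order: Vec<usize> = (0..k).collect();
    order.sort_by_key(|&i| lists[i][0]);
    let mut p = 0;
    let mut max = lists[order[k - 1]][0];

    loop {
        let i = order[p];
        let cur = lists[i][pos[i]];
        if cur == max {
            // Smallest key equals largest key: every list is positioned on it.
            out.push(cur);
            pos[i] += 1;
        } else {
            // Seek by binary search to the first key >= max.
            pos[i] += lists[i][pos[i]..].partition_point(|&v| v < max);
        }
        if pos[i] == lists[i].len() {
            return out; // one list is exhausted, so no further matches exist
        }
        max = lists[i][pos[i]];
        p = (p + 1) % k;
    }
}

fn main() {
    let a = [1, 3, 4, 7, 9];
    let b = [3, 4, 5, 9, 11];
    let c = [0, 3, 9, 10];
    let hits = leapfrog_intersect(&[&a[..], &b[..], &c[..]]);
    println!("{:?}", hits); // prints [3, 9]
}
```

In a full triejoin this intersection would run once per join variable, with each relation exposed as a trie so that fixing one variable narrows the key lists for the next.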

Where it's going

The harness comes first by design: every spec's acceptance criteria reference a concrete test subset, and a spec is not satisfied until its subset is green.

Stage | Scope | Gate
Stage 0 | Harness bootstrap | Selected suite plumbing green; a deliberate failure is correctly flagged red.
Stage 1 (in progress) | Storage + WCOJ + a minimal OWL 2 RL slice | ≥50 W3C OWL 2 RL cases green; within 3× of a materialization baseline on LUBM-100.
Stage 2 | Full storage→SPARQL stack + RDF 1.2 triple terms | Full W3C OWL 2 RL + SPARQL 1.1 + Entailment Regimes green; ORE 2015 RL fragment solved.
Stage 3 | Hardware specialization | GPU GraphBLAS + WCOJ, CXL tiering, multi-node. The Stage-2 conformance bar does not drop.

Non-goals: OWL 2 DL completeness, property-graph compatibility, or embedding-based “neural” reasoning as the source of truth. The symbolic engine is always the source of truth.