A hybrid RDF reasoner for OWL 2 RL, built for modern memory architectures.
Forward and backward chaining over OWL 2 RL semantics, with a SPARQL 1.1 frontend. Apache-2.0, EU-developed, open from the start.
Status: Stage 1. The feasibility prototype is running: 78 of 91 selected W3C OWL 2 RL cases are green, and the worst-case-optimal join (WCOJ) engine has cleared its acceptance gate.
What it is, and why we're building it
HornDB is a triple store you reason over: load RDF, run SPARQL, and get answers with provenance back to the premises that produced them. The name is literal: Horn as in Horn clauses. OWL 2 RL is the OWL 2 profile whose entailment rules are Horn rules that can be evaluated bottom-up to a fixpoint, which is what this engine does.
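As a rough illustration of bottom-up Horn-rule evaluation with provenance, the sketch below runs two OWL 2 RL rules (`cax-sco` and `scm-sco`) semi-naively to a fixpoint and records the premises of each inferred triple. Everything in it is an assumption made for the example: the `Triple` encoding, the term IDs, and the `apply_rules` and `materialize` helpers are hypothetical and are not HornDB's actual API or generated code.

```rust
use std::collections::{HashMap, HashSet};

/// A dictionary-encoded triple: (subject, predicate, object).
type Triple = (u32, u32, u32);

// Hypothetical term IDs, chosen only for this sketch.
const RDF_TYPE: u32 = 0;
const SUB_CLASS_OF: u32 = 1;

/// For each inferred triple: the rule name and the two premises of one derivation.
type Provenance = HashMap<Triple, (&'static str, Triple, Triple)>;

/// Try both rules on the ordered premise pair (a, b).
/// cax-sco: (x type c1), (c1 subClassOf c2)        => (x type c2)
/// scm-sco: (c1 subClassOf c2), (c2 subClassOf c3) => (c1 subClassOf c3)
fn apply_rules(a: Triple, b: Triple) -> Option<(&'static str, Triple)> {
    // Both rules need b to be a subClassOf edge leaving a's object.
    if b.1 != SUB_CLASS_OF || a.2 != b.0 {
        return None;
    }
    match a.1 {
        RDF_TYPE => Some(("cax-sco", (a.0, RDF_TYPE, b.2))),
        SUB_CLASS_OF => Some(("scm-sco", (a.0, SUB_CLASS_OF, b.2))),
        _ => None,
    }
}

/// Semi-naive fixpoint: each round only joins the triples that were new in the
/// previous round (`delta`) against everything known so far (`all`).
fn materialize(base: &[Triple]) -> (HashSet<Triple>, Provenance) {
    let mut all: HashSet<Triple> = base.iter().copied().collect();
    let mut delta: Vec<Triple> = all.iter().copied().collect();
    let mut why = Provenance::new();
    while !delta.is_empty() {
        let mut next = Vec::new();
        for &d in &delta {
            // A full scan keeps the sketch short; a real engine would use indexes.
            for &t in &all {
                for (a, b) in [(d, t), (t, d)] {
                    if let Some((rule, c)) = apply_rules(a, b) {
                        if !all.contains(&c) && !next.contains(&c) {
                            why.insert(c, (rule, a, b));
                            next.push(c);
                        }
                    }
                }
            }
        }
        all.extend(next.iter().copied());
        delta = next;
    }
    (all, why)
}
```

Calling `materialize` on `(alice type Student)`, `(Student subClassOf Person)` yields `(alice type Person)`, and the provenance map points back to exactly those two premises under `cax-sco`. When a triple has several derivations, this sketch keeps only the first one it happens to find.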
The reasoner space today forces a choice. Pure-materialization commercial engines are fast but give up 100–1000× on backward chaining and are not open. Open-source toolkits are flexible but slower on the same workload. HornDB makes a different set of bets: materialize the cheap, regular subset; backward-chain the rest; and treat modern memory hierarchies as a first-class target rather than an afterthought.
The bets
Six design decisions set HornDB apart. The full rationale lives in the vision spec.
- Hybrid execution: Not pure materialization. Materialize the schema and transitive-closure subset; backward-chain the rest with magic sets.
- Modern memory as a target: A tiered working set (fast memory for the hot data, DRAM for warm, CXL/NVMe for cold), designed in, not bolted on.
- Incremental maintenance: DBSP-style Z-set differences instead of DRed counting, so updates touch only what actually changed.
- GraphBLAS closure: Schema-level transitive closure as a semiring matrix multiply on SuiteSparse:GraphBLAS.
- Compiled rules: Soufflé-style ahead-of-time compilation turns OWL 2 RL rules into native Rust, with no rule interpreter in the hot path.
- Provenance as a requirement: Every inferred triple traces back to its premises. Explainability is a hard constraint, not a feature flag.
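To make the Z-set idea in the list above more concrete, here is a minimal sketch of a weighted set and one linear operator over it. It assumes a toy setting: `ZSet`, `lift_types`, and the term IDs are hypothetical names invented for this example, not HornDB's data structures. The point it tries to show is that for a linear operator, applying it to the delta alone and adding the result to the old output gives the same answer as recomputing over the full input.

```rust
use std::collections::HashMap;

/// A dictionary-encoded triple: (subject, predicate, object).
type Triple = (u32, u32, u32);

// Hypothetical term ID for rdf:type, chosen only for this sketch.
const RDF_TYPE: u32 = 0;

/// A Z-set: each element carries an integer weight; zero-weight entries are dropped.
/// A positive weight is an insertion. In the general formalism a negative weight
/// expresses a removal.
#[derive(Debug, Default, Clone, PartialEq)]
struct ZSet(HashMap<Triple, i64>);

impl ZSet {
    fn from_pairs(pairs: &[(Triple, i64)]) -> Self {
        let mut z = ZSet::default();
        for &(t, w) in pairs {
            z.add(t, w);
        }
        z
    }

    /// Add weight `w` to element `t`, removing the entry if it cancels to zero.
    fn add(&mut self, t: Triple, w: i64) {
        let now_zero = {
            let e = self.0.entry(t).or_insert(0);
            *e += w;
            *e == 0
        };
        if now_zero {
            self.0.remove(&t);
        }
    }

    /// Pointwise sum of two Z-sets.
    fn merge(&mut self, other: &ZSet) {
        for (&t, &w) in &other.0 {
            self.add(t, w);
        }
    }
}

/// One step of class-membership propagation as a linear operator: every
/// (x type c1) with weight w yields (x type c2) with weight w for each
/// fixed schema edge c1 subClassOf c2.
fn lift_types(input: &ZSet, sub_class_of: &[(u32, u32)]) -> ZSet {
    let mut out = ZSet::default();
    for (&(x, p, c), &w) in &input.0 {
        if p != RDF_TYPE {
            continue;
        }
        for &(c1, c2) in sub_class_of {
            if c1 == c {
                out.add((x, RDF_TYPE, c2), w);
            }
        }
    }
    out
}
```

With an existing view `lift_types(&old, schema)`, an update only needs `view.merge(&lift_types(&delta, schema))`; the work is proportional to the size of the delta rather than to the size of the store.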
What runs today
Stage 1 is a working engine, not a paper design. Every subsystem below has code in the repository, exercised by tests and the conformance harness.
- Tiered storage (shipping): Dictionary-encoded terms and predicate-partitioned columnar triples, with an N-Triples loader that already ingests RDF 1.2 triple terms.
- Worst-case-optimal joins (shipping): Leapfrog Triejoin with cost-based fallback to a binary hash join. Clears its acceptance gate of ≥10× over a binary-join baseline on the canonical four-cycle workload (~34× measured on a skewed 10⁶-edge graph).
- Compiled OWL 2 RL rules (shipping): Entailment rules compiled ahead-of-time to native Rust, with no interpreter in the hot path, and evaluated semi-naïvely to a fixpoint. 78 of 91 selected W3C OWL 2 RL cases pass.
- GraphBLAS closure (shipping): Schema-level transitive closure runs as a semiring matrix multiply on SuiteSparse:GraphBLAS, linked in natively.
- Incremental maintenance (insertion-only): DBSP-style Z-set deltas update only what changed. Insertions ship today; retraction is Stage 2.
- SPARQL 1.1 frontend (shipping): Parser, algebra, planner and an HTTP endpoint answering SELECT / ASK / CONSTRUCT, including GROUP BY and aggregates, driven end-to-end against the LDBC Semantic Publishing Benchmark.
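For readers unfamiliar with Leapfrog Triejoin, the sketch below shows only its core primitive: the leapfrog intersection of several sorted key lists, which the full algorithm applies once per join variable while walking trie-shaped indexes. It is an illustrative sketch under stated assumptions, not HornDB's join engine: the inputs are assumed to be sorted and duplicate-free `u32` slices, and the `seek` and `leapfrog_intersect` names are hypothetical.

```rust
/// Return the first index at or after `pos` in sorted `xs` whose value is >= `target`.
/// Binary search stands in for the galloping seek a tuned engine might use.
fn seek(xs: &[u32], pos: usize, target: u32) -> usize {
    pos + xs[pos..].partition_point(|&v| v < target)
}

/// Intersect k sorted, duplicate-free lists by letting the iterators
/// "leapfrog" each other: each one in turn seeks to the largest key seen so far.
fn leapfrog_intersect(lists: &[&[u32]]) -> Vec<u32> {
    let mut out = Vec::new();
    if lists.is_empty() || lists.iter().any(|l| l.is_empty()) {
        return out;
    }
    let k = lists.len();
    let mut pos = vec![0usize; k];
    // Start from the largest first key; nothing smaller can be in the intersection.
    let mut max = lists.iter().map(|l| l[0]).max().unwrap();
    let mut i = 0; // iterator whose turn it is
    let mut agree = 0; // how many consecutive iterators currently sit on `max`
    loop {
        pos[i] = seek(lists[i], pos[i], max);
        if pos[i] == lists[i].len() {
            return out; // one list is exhausted, so no further matches exist
        }
        let v = lists[i][pos[i]];
        if v == max {
            agree += 1;
        } else {
            // This iterator overshot: its key becomes the new target.
            max = v;
            agree = 1;
        }
        if agree >= k {
            // All k iterators sit on the same key: emit it and move past it.
            out.push(max);
            pos[i] += 1;
            if pos[i] == lists[i].len() {
                return out;
            }
            max = lists[i][pos[i]];
            agree = 1;
        }
        i = (i + 1) % k;
    }
}
```

Because every seek jumps directly to the current maximum, the amount of work tracks the smallest list and the gaps between matches rather than the product of list sizes, which is the property that makes the multi-way join attractive on cyclic patterns such as the four-cycle workload.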
Where it's going
The harness comes first by design: every spec's acceptance criteria reference a concrete test subset, and a spec is not satisfied until its subset is green.
| Stage | Scope | Gate |
|---|---|---|
| Stage 0 | Harness bootstrap | Selected suite plumbing green; a deliberate failure is correctly flagged red. |
| Stage 1 (in progress) | Storage + WCOJ + a minimal OWL 2 RL slice | ≥50 W3C OWL 2 RL cases green; within 3× of a materialization baseline on LUBM-100. |
| Stage 2 | Full storage→SPARQL stack + RDF 1.2 triple terms | Full W3C OWL 2 RL + SPARQL 1.1 + Entailment Regimes green; ORE 2015 RL fragment solved. |
| Stage 3 | Hardware specialization | GPU GraphBLAS + WCOJ, CXL tiering, multi-node. The Stage-2 conformance bar does not drop. |
Non-goals: OWL 2 DL completeness, property-graph compatibility, or embedding-based “neural” reasoning as the source of truth. The symbolic engine is always the source of truth.