Unturf AGI: Merkle Providence Reverse RAG

Unturf Automated General Intelligence: Merkle Providence Reverse RAG

Provenance-Preserving Cache Chains for Small Language Models

How Merkle trees turn Reverse RAG into a verifiable, shared knowledge layer.

Russell Ballestrini <russell@unturf.com>, David Wong <david.s.wong@pm.me>, Riley Morgan <guybriley02@gmail.com>

Abstract

Reverse Retrieval Augmented Generation (Reverse RAG) showed that client-side context injection lets small 8B-parameter language models outperform much larger models on page-specific questions. Our reference inference endpoint runs Hermes 3, a NousResearch instruction fine-tune of Meta's Llama 3.1 8B, served freely at uncloseai.com. The client already holds the document. No vector database. No embedding pipeline. No retrieval failures.

Merkle Providence Reverse RAG extends this with a second insight: the client not only holds the document, it can remember what it already computed about that document, prove that memory came from an unmodified source, & share that verified knowledge with others on public or private chains.

A Merkle tree over document content chunks produces a root hash: a 32-byte fingerprint that changes when any byte of the document changes. Combining this root hash with a question hash, a model identity, & a small set of governance dimensions yields a content-addressed cache key. A cache hit returns a previously computed answer with a Merkle proof of origin; a cache miss triggers fresh inference, classifies the result against its source, & extends the chain.
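A minimal sketch of the root-hash idea, under stated assumptions: SHA-256 as the hash (which gives the 32-byte root), fixed-size byte chunks, & a simple pairwise tree that promotes an unpaired node to the next level. The chunk size & the function names `chunk` & `merkle_root` are hypothetical; the leaf/node prefix bytes borrow the domain-separation convention from RFC 6962. Aborist's actual chunking & tree layout may differ.

```python
import hashlib

CHUNK_SIZE = 1024  # hypothetical chunk size, chosen only for illustration


def chunk(data: bytes, size: int = CHUNK_SIZE) -> list[bytes]:
    """Split a document into fixed-size chunks (an empty document yields one empty chunk)."""
    return [data[i:i + size] for i in range(0, len(data), size)] or [b""]


def merkle_root(chunks: list[bytes]) -> bytes:
    """Return a 32-byte SHA-256 Merkle root over the given chunks."""
    # Prefix leaves with 0x00 and internal nodes with 0x01 so a leaf
    # can never be confused with an internal node.
    level = [hashlib.sha256(b"\x00" + c).digest() for c in chunks]
    while len(level) > 1:
        nxt = [
            hashlib.sha256(b"\x01" + level[i] + level[i + 1]).digest()
            for i in range(0, len(level) - 1, 2)
        ]
        if len(level) % 2:
            nxt.append(level[-1])  # promote the unpaired node unchanged
        level = nxt
    return level[0]


document = b"a" * 3000
root = merkle_root(chunk(document))
tampered_root = merkle_root(chunk(document[:-1] + b"b"))
```

Here `root` & `tampered_root` differ even though the two inputs differ in a single byte, which is the tamper sensitivity the cache key relies on.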

Our reference implementation is Aborist, a Python content-addressed document store that ingests entire corpora at once, computes Merkle commitments at scale, classifies each answer's faithfulness against its sources with a deterministic lexical verifier, & serves cached answers to any peer that produces a matching cache key. Aborist runs against hermes.ai.unturf.com by default & ships under AGPL-3.0-only at git.unturf.com/engineering/unturf/aborist.
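As a rough illustration of what a deterministic lexical check can look like, the sketch below implements only a verbatim-quote test: a quote counts as supported when it appears in the source after both are normalized the same way. The helper names `normalize` & `quote_supported`, & the choice of Unicode NFC plus whitespace collapsing as the normalization, are assumptions made for this example; they are not Aborist's verifier.

```python
import unicodedata


def normalize(text: str) -> str:
    """Apply Unicode NFC and collapse all whitespace runs to single spaces."""
    return " ".join(unicodedata.normalize("NFC", text).split())


def quote_supported(quote: str, source: str) -> bool:
    """Binary verdict: True only if the normalized quote occurs in the normalized source."""
    q = normalize(quote)
    return bool(q) and q in normalize(source)


source = "A Merkle root\nhash changes when any byte of the document changes."
ok = quote_supported("root  hash changes", source)    # whitespace differences are tolerated
bad = quote_supported("root hashes change", source)   # altered wording is rejected
```

The same inputs always give the same verdict, & no model is consulted, so the result can be rechecked by any peer holding the source.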

Once published, this technique forces a reckoning: every RAG system that cannot prove provenance of its retrieved chunks operates on unverifiable context. Merkle Providence makes provenance a first-class primitive.

We call this stack Web 2.5: the existing web's documents & URIs, augmented with machine learning context injection & Merkle-verified answer chains, without requiring a blockchain, a new protocol, or replacement of any existing infrastructure. Web 2.5 runs on top of what already exists. Any page. Any browser. Any small model. Public or private chain. No permission required.

What's in the paper

The Practitioner With a Phone — commodity hardware + remote inference + local provenance is enough for AGI participation. No GPU farm, no corporate gatekeeper. A browser, a small model, & a Merkle root.

Reference Stack & Reverse RAG Primer — Hermes 3 Llama 3.1 8B at hermes.ai.unturf.com; why 8B is a thesis, not a constraint.

Merkle Trees as Content-Addressed Cache Keys — determinism + tamper sensitivity. Git & Mercurial commit hashes are already Merkle roots.

The Eight-Dimensional Cache Key — source_root, question_hash, model_profile_hash, conversation_hash, governance_policy_hash, schema_version, canonicalization_version, chunking_version. Each dimension partitions the cache namespace cleanly.

The Audit Chain & Falsification States — tamper-evident local log; live / failed / stale / quarantined; falsify (audit-preserving) vs. burn (kindergarten leaf-delete).

Aborist — Reference Architecture — surface / core / providence cache stacked on one SQLite file. 67% storage reduction measured on the 2003-05-16 enwiki cur dump (128,245 documents). Soft hash vs. hard hash discipline.

The Layered Verifier — Lexical, No LLM in the Proof Path — quote, span, entity, paraphrase strategies. Deterministic byte-for-byte. Wikitext-base normalization runs before the LLM AND inside the verifier.

Claim-Lattice Mode — Runtime Owns Quote Text — the model never types a quote string; it types a pointer. Synthetic-elision structurally impossible. Pointer-line & JSON modalities both first-class. Eight deterministic hard checks, including anchor-class warrant (proper-noun, date, entity-list, count with digit↔word equivalence, why-cause) and source-title relevance. Nine-stage run-DAG.

The Four-Rung Ladder — POINTER-LINKED → ANCHOR-WARRANTED → EVIDENCE-WARRANTED → (ENTAILMENT-VERIFIED reserved), with UNGROUNDED below. Each rung names a strictly stronger property.

Federation — Snapshot Roots & Mesh — one hash names the corpus at a point in time. Ed25519 signs envelopes; X25519+ChaCha20-Poly1305 wraps per-epoch secrets. Off by default. Worked example: 38 GB enwiki snapshot at 3.47M docs.

Personal vs. Public Chains — same primitive supports both. Document content never travels; proofs do.

The Truth-Seeking Ratchet — emergence preserved, not suppressed. aborist reclassify promotes records up the ladder when new evidence arrives. The ratchet only goes one way: guess → warranted record.

Decisions and Constraints — eleven first-principles rules: verifier stays binary, runtime owns quotes, cores never evict, one-shot discipline, idempotent re-ingest, local-first, emergence preserved, soft hash never enters proof path, cache is not consent, governance partitions the cache (not the corpus), labels name properties (not vibes).

What This Proves and What It Does Not — structural-and-lexical floor named honestly; semantic ceiling named honestly. Calibration is the architecture's promise.

Forcing the Issue — once Merkle-proved answers exist, every RAG without provenance carries an implicit disclaimer. Medical, legal, financial, security domains adopt first.

Web 2.5 & AGI as a Commons — Roadmap & Invitation — no blockchain, no new protocol, no platform replacement. The practitioner's onboarding in six concrete steps.

Related Work — Reverse RAG, RAG, Git/Mercurial object models, Zero-Knowledge Proofs, Merkle-AGI v9.8 substrate, Merkle's original construction.
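The eight dimensions named under The Eight-Dimensional Cache Key can be pictured as one record hashed into a single lookup key. The sketch below uses the field names from that entry; everything else is an assumption made for illustration: the field types, the canonical-JSON serialization, SHA-256 as the combiner, the in-memory dict standing in for the cache, & the helper `sha256_hex`.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, replace


def sha256_hex(data: bytes) -> str:
    """Hypothetical helper: hex SHA-256 of raw bytes."""
    return hashlib.sha256(data).hexdigest()


@dataclass(frozen=True)
class CacheKey:
    source_root: str
    question_hash: str
    model_profile_hash: str
    conversation_hash: str
    governance_policy_hash: str
    schema_version: int
    canonicalization_version: int
    chunking_version: int

    def digest(self) -> str:
        # Sorted keys and fixed separators make the serialization deterministic.
        payload = json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))
        return sha256_hex(payload.encode("utf-8"))


key = CacheKey(
    source_root=sha256_hex(b"stand-in for a Merkle root"),
    question_hash=sha256_hex(b"what changes the root hash?"),
    model_profile_hash=sha256_hex(b"example model profile"),
    conversation_hash=sha256_hex(b""),
    governance_policy_hash=sha256_hex(b"example policy"),
    schema_version=1,
    canonicalization_version=1,
    chunking_version=1,
)

cache: dict[str, str] = {key.digest(): "previously computed answer"}

rechunked = replace(key, chunking_version=2)  # change exactly one dimension
hit = cache.get(key.digest())         # same eight values: cache hit
miss = cache.get(rechunked.digest())  # different chunking version: cache miss
```

Changing any single dimension produces a different digest, so answers computed under one chunking version, policy, or model profile are never served under another.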

Citation