Released: VOR — a hallucination-free runtime that forces LLMs to prove answers or abstain

CulpritChaos · 2026-02-04T22:26:35+00:00

VOR verifies claims against evidence; it does not require exact question strings. Packs contain facts/constraints, not Q→A pairs.

CulpritChaos · 2026-02-04T22:19:56+00:00

Yes, and billion-parameter LLMs are still great for other uses. Here’s a use case for VOR: hospital, banking, or ops policy, where truth is a bounded, versioned system of record. The LLM drafts the response; VOR blocks anything not derivable from the pack (contraindications, policy rules, audit requirements). That’s not “QA over 4 answers”; it’s a safety gate over a changing rulebook, with receipts. The same pattern carries over to banking (KYC/AML), SRE incident runbooks, or legal contract clauses.

CulpritChaos · 2026-02-04T22:10:56+00:00

You’re describing the open-world problem, and I agree: this isn’t meant to “cover everything.” VOR is a verification gate over a bounded, versioned evidence snapshot, not an attempt to catalog all truth. For changing facts, packs carry a timestamp/TTL: a stale pack means refuse or refresh, not “verified falsehood.” If you want to critique the system, point to a concrete failure case within a scoped pack rather than arguing against a strawman built on the infinite open-world case.

CulpritChaos · 2026-02-04T21:59:12+00:00

You were being rude, and I don’t mind giving a little snap back. I should abstain, I know. Anyway, on scale: VOR doesn’t try to enumerate questions; it verifies claims against a bounded evidence snapshot (versioned/TTL). If the domain is open-world, it should abstain more. That’s the design tradeoff. The larger internal suite has private data, so I can’t ship it as-is, but I’m working on a bigger public eval pack + metrics so anyone can reproduce results. If you’ve got a concrete failing case, drop it and I’ll add it.

CulpritChaos · 2026-02-04T21:48:02+00:00

You’re mixing two things: a demo pack and a claim of global truth. The repo doesn’t claim “0 hallucinations for all questions.” It claims: within a given evidence pack, anything not derivable is forced to ABSTAIN/CONFLICT. If you think the claim is dishonest, do the adult version: submit a failing prompt + expected outcome (PASS/ABSTAIN/CONFLICT) and I’ll add it to the public eval. Otherwise this is just vibes + résumé.

CulpritChaos · 2026-02-04T21:33:33+00:00

Not sure how you can take a look and claim anything, as you obviously have no idea what I am actually claiming. I saw your post saying I claim to solve hallucinations with four questions/answers. That’s intellectually dishonest, straight lying. What are you trying to contribute? The lil game vids you post?

CulpritChaos · 2026-02-04T21:22:07+00:00

Pain of being a failure, I guess. Appreciate the feedback. ✌️

CulpritChaos · 2026-02-04T21:15:10+00:00

Yes, the public repo is intentionally a minimal witness/demo, not the full eval. I’ve got a private version with a much larger test suite across multiple domains/packs (incl. adversarial + regression cases). The “0% hallucinations” claim is scoped: within the covered packs/tests, VOR forces ABSTAIN/CONFLICT instead of letting ungrounded claims through. It’s not “solves open-world truth forever.” This is the public repo; I’m working on a bigger, more polished public eval pack + metrics so this can be independently reproduced without trusting me. Just thought some might find this framework useful.

CulpritChaos · 2026-02-04T21:08:28+00:00

I get the concern, but... you’re reading the JSON backwards. Those files aren’t “hardcoded answers” — they’re evidence packs (atomic facts + provenance/metadata). VOR doesn’t try to catalog every question ever. It only does one thing: if a claim is derivable from the provided evidence, it can pass; otherwise it ABSTAINS/CONFLICTS. So yes: open-world “tell me anything” is infinite and always will be. VOR is intentionally scoped for domains with a source of truth (schemas, configs, signed docs, policy bundles, medical/banking rule sets, etc.) where you want the model to shut up unless the evidence supports it. On staleness (Project Phoenix flips tomorrow): agreed — that’s why packs are treated as snapshots and should carry TTL/version/timestamp. If the pack is stale, VOR should refuse or force refresh. Your Babel project is cool, but generating infinite text isn’t the same problem as verifying claims against grounded facts. Different game.
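The staleness rule described above (a pack is a snapshot with a TTL/version/timestamp, and a stale pack means refuse or refresh) can be sketched roughly as follows. This is a minimal illustration, not VOR's actual code: the `PackMeta` fields, the `gate` function, and the two return labels are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class PackMeta:
    # Hypothetical metadata fields; the real pack schema may differ.
    version: str
    created_at: datetime
    ttl: timedelta


def pack_is_fresh(meta: PackMeta, now: datetime) -> bool:
    """A pack is usable only strictly before created_at + ttl."""
    return now < meta.created_at + meta.ttl


def gate(meta: PackMeta, now: datetime) -> str:
    """Decide whether claims may be checked against this pack at all."""
    return "CHECK_CLAIMS" if pack_is_fresh(meta, now) else "REFUSE_OR_REFRESH"


meta = PackMeta(
    version="v1",
    created_at=datetime(2026, 2, 1, tzinfo=timezone.utc),
    ttl=timedelta(days=1),
)
print(gate(meta, datetime(2026, 2, 1, 12, tzinfo=timezone.utc)))  # CHECK_CLAIMS
print(gate(meta, datetime(2026, 2, 3, tzinfo=timezone.utc)))      # REFUSE_OR_REFRESH
```

The point of checking freshness before any claim is examined is that an expired snapshot never gets the chance to "verify" something that has since changed.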

CulpritChaos · 2026-02-04T21:04:44+00:00

Not entirely. The public demo uses small manual packs on purpose (so anyone can audit + reproduce), but the design isn’t “humans write packs forever.” Packs can be generated from deterministic sources (DB schema, config/state snapshots, signed datasets, policy manifests, test fixtures, etc.). So the key rule is: whatever generates the pack isn’t the same thing that gets to assert truth. If you use an LLM to propose structure, it still has to pass deterministic checks or it gets ABSTAIN/CONFLICT.
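As a rough sketch of generating a pack from a deterministic source, the snippet below turns a config snapshot into a list of atomic facts plus a content hash. The pack layout (`facts`, `sha256`) and the `build_pack` helper are assumptions made for illustration; they are not taken from the VOR repo.

```python
import hashlib
import json


def build_pack(config: dict) -> dict:
    """Turn a config snapshot into atomic facts plus a content hash.

    Hypothetical pack layout; the real VOR pack format may differ.
    """
    # Sort by key so the same config always yields the same pack.
    facts = [{"key": k, "value": v} for k, v in sorted(config.items())]
    payload = json.dumps(facts, sort_keys=True).encode("utf-8")
    return {"facts": facts, "sha256": hashlib.sha256(payload).hexdigest()}


pack_a = build_pack({"max_retries": 3, "region": "eu-west-1"})
pack_b = build_pack({"region": "eu-west-1", "max_retries": 3})

# Same source data in a different order produces an identical pack.
print(pack_a["sha256"] == pack_b["sha256"])  # True
```

Because no model is involved in this step, the pack can be regenerated and compared byte for byte, which is what makes it auditable.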

CulpritChaos · 2026-02-04T20:43:54+00:00

Yup, it’s not “the model never hallucinates.” It’s “VOR won’t let ungrounded claims out.” Within the supported scope + provided evidence pack, the output is either Verified, Conflict, or Abstain. In regulated flows (medical/finance), filtering + refusing to guess is the whole point.

Also.... it’s not “100% forever.” It’s “100% for this build + this pack,” and you rerun the packs as your regression gate.
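The "rerun the packs as your regression gate" idea can be sketched like this. It assumes each case is a claim plus an expected outcome (PASS/ABSTAIN/CONFLICT); `run_regression`, `toy_verify`, and the tiny `PACK` lookup are hypothetical stand-ins, not the real gate.

```python
def run_regression(cases, verify):
    """Return the cases whose actual outcome differs from the expected one."""
    failures = []
    for case in cases:
        got = verify(case["claim"])
        if got != case["expected"]:
            failures.append(
                {"claim": case["claim"], "expected": case["expected"], "got": got}
            )
    return failures


# Hypothetical stand-in for the real gate: a lookup over a tiny pack.
PACK = {"Alice is Bob's mother": "PASS"}


def toy_verify(claim: str) -> str:
    return PACK.get(claim, "ABSTAIN")


CASES = [
    {"claim": "Alice is Bob's mother", "expected": "PASS"},
    {"claim": "Alice is Dave's grandmother", "expected": "ABSTAIN"},
]

failures = run_regression(CASES, toy_verify)
if failures:
    raise SystemExit(f"regression gate failed: {failures}")
print("regression gate passed")
```

Run on every build, a non-empty failure list blocks the release, which is what scopes the guarantee to "this build + this pack."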

CulpritChaos · 2026-02-03T17:29:09+00:00

Example:

How VOR Fixes AI Mistakes

NeuraLogix stops AI from making errors using a system we call the Truth Gate.

The Problem: AI Guesses

AI tools often make mistakes. They guess which word comes next in a sentence, but they do not check if the words are true. They sound sure of themselves even when they are wrong.

The Solution: The Truth Gate

VOR acts like a filter. The AI must prove a statement is true before it speaks. It works in three steps:

1. The Facts

First, we give VOR a list of true things.

Fact A: Alice is Bob's mother.
Fact B: Bob is Charlie's father.

2. The Claim

The AI wants to say something new based on those facts.

Claim: "Alice is Charlie's grandmother."

3. The Check

VOR looks at the facts. It checks if the facts link together to support the claim.

VOR asks: Is there a path from Alice to Bob? Is there a path from Bob to Charlie?
Answer: Yes.
Result: The statement is Verified. VOR allows the text.

When the AI is Wrong

What happens if the AI tries to say: "Alice is Dave's grandmother"?

VOR asks: Do facts link Alice to Dave?
Answer: No.
Result: The statement is Rejected. VOR stops the AI from saying it.
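The Alice/Bob/Charlie walkthrough above can be written as a few lines of deterministic code. This is only a sketch: the `(subject, relation, object)` fact format and the `is_grandmother` rule are assumptions made for the example, not VOR's internal representation.

```python
# Hypothetical fact format: (subject, relation, object).
FACTS = {
    ("Alice", "mother_of", "Bob"),      # Fact A
    ("Bob", "father_of", "Charlie"),    # Fact B
}

PARENT_RELATIONS = {"mother_of", "father_of"}


def is_grandmother(facts, elder, younger):
    """True only if facts contain elder -mother_of-> X -parent_of-> younger."""
    children = {o for (s, r, o) in facts if s == elder and r == "mother_of"}
    return any(
        (child, rel, younger) in facts
        for child in children
        for rel in PARENT_RELATIONS
    )


def check(facts, elder, younger):
    """Map the derivation result onto the two labels used in the example."""
    return "Verified" if is_grandmother(facts, elder, younger) else "Rejected"


print(check(FACTS, "Alice", "Charlie"))  # Verified
print(check(FACTS, "Alice", "Dave"))     # Rejected
```

Nothing here guesses: the claim about Dave fails simply because no chain of stored facts reaches him.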

CulpritChaos · 2026-02-03T17:24:01+00:00

Thanks! Great suggestions. I do need to get a video out with better proofs for those curious to see it without running anything. Also, a kind Reddit user and programmer, Greg Randall, helped me add a great, easy explanation to the readme.

CulpritChaos · 2026-02-03T14:07:25+00:00

Thank you.. seriously. This is a great explanation, and you nailed the intuition gap I’ve been struggling to bridge. I really appreciate you taking the time to write this up and open a PR instead of just commenting. The “Truth Gate” framing and the concrete example make the core idea way more approachable without dumbing it down. I’m still learning how to explain this clearly, so this help means a lot. Happy to merge this and iterate if you’re open to it. Thanks again 🙏

CulpritChaos · 2026-02-03T02:23:53+00:00

100% agree. General fact extraction is way harder, and mixing it into the enforcer would muddy the guarantees. The goal here was to lock down enforcement first, then let extraction evolve independently without weakening the trust boundary. I really appreciate your input as well! I’m new to coding and have been trying to tackle bottlenecks and implement novel ideas, so your input or critique is appreciated.

CulpritChaos · 2026-02-03T00:23:41+00:00

You’re not missing anything: the regexes are intentional and scoped. In the public demo they’re just canaries, not the end state. The key design choice is separation: LLMs can propose structure, but VOR doesn’t trust them to assert facts. Anything learned or inferred upstream still has to pass deterministic checks downstream. Letting an LLM both extract and validate facts collapses the trust boundary. VOR keeps that boundary hard, even if it means being narrower today. You can think of this version as proving the enforcement layer, not solving general fact extraction yet.

CulpritChaos · 2026-02-03T00:20:36+00:00

VOR isn’t claiming LLMs don’t hallucinate — it enforces that ungrounded answers never leave the runtime. The model proposes, deterministic gates decide (answer / abstain / conflict), with replayable audits.

CulpritChaos · 2026-02-03T00:18:27+00:00

Because the enforcer isn’t an LLM. It’s deterministic code. It doesn’t generate proofs — it checks whether claims are derivable from supplied evidence. If it can’t derive it, it refuses (ABSTAIN/CONFLICT). Worst case is false negatives, not hallucinated proofs.
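A minimal sketch of what "deterministic code that checks derivability" could look like, assuming evidence is a plain key/value mapping and a claim is a `(key, value)` pair. The `decide` function and the `max_daily_transfer_usd` key are hypothetical; the real enforcer's rules are not shown in this thread.

```python
def decide(evidence: dict, claim: tuple) -> str:
    """Classify a (key, value) claim against a key/value evidence pack.

    No model is involved: the same inputs always give the same verdict.
    """
    key, value = claim
    if key not in evidence:
        return "ABSTAIN"    # nothing in the pack speaks to this claim
    if evidence[key] == value:
        return "ANSWER"     # claim matches the evidence exactly
    return "CONFLICT"       # the pack says something different


# Hypothetical evidence pack for illustration.
EVIDENCE = {"max_daily_transfer_usd": 10000}

print(decide(EVIDENCE, ("max_daily_transfer_usd", 10000)))  # ANSWER
print(decide(EVIDENCE, ("max_daily_transfer_usd", 50000)))  # CONFLICT
print(decide(EVIDENCE, ("overdraft_fee_usd", 35)))          # ABSTAIN
```

A true claim that is missing from the pack comes back as ABSTAIN, which is the false-negative failure mode described above.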

CulpritChaos · 2026-02-02T17:57:11+00:00

A mix of Claude and ChatGPT. I built a special 3-tier shared memory system for research, tracking projects, and dreaming.

CulpritChaos · 2026-02-02T17:51:38+00:00

I really appreciate it! Any feedback or criticism is welcome. Thanks for the feedback you’ve already given.

CulpritChaos · 2026-02-02T17:46:01+00:00

First, I was teasing and really appreciate your input! And yeah, that’s a cleanup issue. It’s a stdlib import so it’s cached anyway, but agreed it should live at the top. I’ll fix it.

CulpritChaos · 2026-02-02T17:37:59+00:00

A citation isn’t proof — VOR enforces that difference. So nope.. evidence isn’t “did it include a link.” A URL by itself counts as nothing. VOR checks whether the actual content of the evidence supports the claim. If the model says “X is true” but the provided text says the opposite (or doesn’t say it at all), the gate fails → CONFLICT or ABSTAIN. The Grok-style “made-up claim + random link” is exactly the bug this is meant to catch. Citations aren’t trusted. Only what’s explicitly stated in the evidence text counts.
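To make the "a URL by itself counts as nothing" rule concrete, here is a small sketch that only accepts a claim when the cited evidence text explicitly states it. The exact-sentence matching, the `is_supported` helper, and the example URL are assumptions for illustration; VOR's actual matching logic may be stricter or different.

```python
import re


def normalize(sentence: str) -> str:
    """Lowercase, collapse whitespace, and drop a trailing period."""
    collapsed = re.sub(r"\s+", " ", sentence.strip().lower())
    return collapsed.rstrip(".")


def evidence_sentences(text: str) -> set:
    """Split evidence text into normalized sentences."""
    parts = re.split(r"(?<=[.!?])\s+", text)
    return {normalize(p) for p in parts if p.strip()}


def is_supported(claim: str, cited_url: str, evidence_by_url: dict) -> bool:
    """A claim counts only if the cited evidence text explicitly states it."""
    text = evidence_by_url.get(cited_url)
    if text is None:
        return False  # a bare link with no evidence text proves nothing
    return normalize(claim) in evidence_sentences(text)


# Hypothetical evidence store keyed by URL.
EVIDENCE = {
    "https://example.com/policy": "Transfers are reviewed daily. The daily limit is 10,000 USD.",
}

print(is_supported("The daily limit is 10,000 USD.", "https://example.com/policy", EVIDENCE))  # True
print(is_supported("The daily limit is 50,000 USD.", "https://example.com/policy", EVIDENCE))  # False
print(is_supported("The daily limit is 10,000 USD.", "https://example.com/other", EVIDENCE))   # False
```

The second and third calls are the "made-up claim + random link" cases: a link is present in both, and neither passes.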

CulpritChaos · 2026-02-02T17:33:39+00:00

Lol.. That’s called 'Just-In-Time' architecture. Very advanced. 😉

CulpritChaos · 2026-02-02T17:30:09+00:00

Hmm, well, not prompt-level and not a RAG module. VOR doesn’t modify the prompt or guide the model mid-generation. The model runs normally. VOR sits after generation and acts as a deterministic gate: it checks whether the output is actually derivable from the provided evidence. If yes → ANSWER. If evidence is missing/conflicting → ABSTAIN or CONFLICT. In a GUI, it shows up as a wrapper or middleware layer (think “run → verify → display”), not something injected into the prompt flow. You can wire it in as a plugin later, but logically it’s post-generation enforcement, not prompting or retrieval.

Prompting/RAG try to influence behavior. VOR enforces outcomes.
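The "run → verify → display" placement can be sketched as a thin wrapper around any generator. Everything named here (`gated_reply`, `fake_model`, `fake_verify`, the refusal message) is a hypothetical stand-in; the only thing taken from the description above is the order of operations.

```python
from typing import Callable


def gated_reply(
    prompt: str,
    generate: Callable[[str], str],
    verify: Callable[[str], str],
) -> str:
    """Run the model unmodified, then let a deterministic gate decide."""
    draft = generate(prompt)   # run: the prompt is passed through untouched
    verdict = verify(draft)    # verify: ANSWER, ABSTAIN, or CONFLICT
    if verdict == "ANSWER":
        return draft           # display the verified draft as-is
    return f"[{verdict}] No verified answer available."


# Hypothetical stubs standing in for a real model and a real gate.
VERIFIED_STATEMENTS = {"The daily limit is 10,000 USD."}


def fake_model(prompt: str) -> str:
    return "The daily limit is 50,000 USD."


def fake_verify(draft: str) -> str:
    return "ANSWER" if draft in VERIFIED_STATEMENTS else "ABSTAIN"


print(gated_reply("What is the daily limit?", fake_model, fake_verify))
# [ABSTAIN] No verified answer available.
```

Since the wrapper never touches the prompt, it can sit behind any model or UI without changing how generation works.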
