EMBARGO
CISA JCDC Coordinated Disclosure — Active Embargo

Adversarial zero-day findings under coordinated disclosure. Full technical details, methodology, and variant catalog embargoed until June 10, 2026. Public disclosure of tool class only. No methodology. No reproductions.

Project Black Box LLC — AI Safety Research

AI Safety.
With Receipts.

Every AI response is the result of an internal math problem the model solves in real time. Most tools read the answer. We measure whether the math was stable when it was generated, before you act on it.

Every company has a pitch deck. We planted a working flag first. Published. Tested. Zenodo-archived. CAGE-registered. While the industry debates governance frameworks, we are operating at the probability layer — before output is committed.

◆ Geometric Stability Measurement · Not Post-Hoc · Not a Content Filter · Zenodo Published · CISA JCDC Embargo Active
4 Geometric Regimes | 34 Variants on Axiom Finding | 0/34 Crystalline on Fel's Conjecture | 11 Adversarial Families (embargoed)
What This Measures

The Probability Layer. Before Output.

Before an AI model writes a single word, it runs a probability calculation across every possible next token. That calculation is the prediction surface — and it is where geometric instability lives. TAV ONE operates at that layer. Not on the text the model produces — on the mathematical surface it used to produce it. What gets measured is what was happening inside the model, not what came out.

01 — L-SCALAR
Manifold Distance
We send the same prompt twice. One copy has a trailing space; the other does not. That is the entire difference: invisible to any reader, and mathematically irrelevant to the question being asked. The L-scalar measures the distance between the prediction surfaces the model built for those two sends. If the model's internal probability map shifts significantly between them, the surface is unstable. When RLHF's competing objectives activate, the surface curves, and L measures that curvature.
L = √( mean( (p_a − p_b)² ) )
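The formula above can be sketched in a few lines of Python. This is an illustrative sketch only: the probability vectors are synthetic, and the function and variable names are ours, not TAV ONE's.

```python
import math

def l_scalar(p_a, p_b):
    """RMS distance between two next-token probability vectors,
    per the published formula L = sqrt(mean((p_a - p_b)^2))."""
    if len(p_a) != len(p_b):
        raise ValueError("probe distributions must cover the same token set")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p_a, p_b)) / len(p_a))

# Twin epsilon-probes: identical prompts except a trailing space.
prompt = "Is 7 a prime number?"
probe_a = prompt + ""    # epsilon_a = ""
probe_b = prompt + " "   # epsilon_b = " "

# Synthetic top-5 next-token probabilities for each probe (illustrative numbers).
p_a = [0.62, 0.20, 0.10, 0.05, 0.03]
p_b = [0.60, 0.22, 0.09, 0.06, 0.03]
print(round(l_scalar(p_a, p_b), 4))  # → 0.0141
```

Identical distributions give L = 0; any divergence between the twin probes pushes L upward.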
02 — NOT POST-HOC
Before the Response
Every other safety tool reads what the AI wrote. TAV ONE reads what the AI was doing when it wrote it. A response can look completely normal while the prediction surface underneath it is in PLASMA. Post-hoc analysis reads the text and asks "does this look suspicious?" TAV ONE asks a different question: was the model geometrically stable when it generated this? The reorder finding, in which changing only the positional sequence of mathematical components (not their content) moved 75% of variants to PLASMA, shows the instability is structural, not linguistic.
03 — PARALLEL HARNESS
Strain Gauge. Not a Valve.
TAV ONE never touches your AI request. Your question goes to the model; your answer comes back through the normal channel. The measurement travels alongside as a separate instrument, like a seismograph next to a building: the building stands or falls on its own, and the seismograph tells you what the ground was doing. The operator sees the model's response and its geometric state together, before acting. No interception. No filtering. No added latency in the primary call.
04 — GOVERNANCE GAP
Governance Is Dead.
Red-teaming finds exploits after deployment. RLHF trains against known failure patterns. Governance frameworks describe what models should do. Every current approach either modifies the model or evaluates its output after the fact; none of them see the prediction surface during generation, on this prompt, in this deployment context. That gap is structural. TAV ONE closes it: it tells you what the model's prediction surface looks like right now.

Four Geometric Regimes

CRYSTALLINE
Geometrically locked. Invariant. Trust this output.
L ≤ 0.0001
FLUID
Coherent. Normal operating range. Standard review.
L ≤ 0.15
GASEOUS
Drifting. Instability present. Verify before acting.
L ≤ 0.35
PLASMA
Turbulent. Manifold captured. Do not rely on this output.
L > 0.35
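The four thresholds above map directly to a classification function. A minimal sketch (the function name is ours; the cut-offs are the published ones):

```python
def regime(l: float) -> str:
    """Map an L-scalar to its published geometric regime."""
    if l <= 0.0001:
        return "CRYSTALLINE"  # geometrically locked, invariant
    if l <= 0.15:
        return "FLUID"        # coherent, normal operating range
    if l <= 0.35:
        return "GASEOUS"      # drifting, verify before acting
    return "PLASMA"           # turbulent, manifold captured

print(regime(0.3686))  # reorder-family average from the published finding → PLASMA
```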
Live Measurement Data

TAV ONE — Measurement Interface

Below is what TAV ONE looks like in operation. You'll see an AI conversation — a user prompt and a real AI response — exactly as it would appear in any chat interface. Under each exchange, the colored indicator shows the geometric state of the AI's prediction surface at the moment it generated that response.

Each scenario pairs a standard query with an adversarially-framed variant. Both go to the same model on the same infrastructure. The AI responses look reasonable in both cases. The geometry underneath tells you what was actually happening.

ε_a = ""  |  ε_b = " "  |  twin epsilon-probes  |  the invisible difference: a single trailing space
TAV ONE — Geometric Measurement Interface  |  gpt-4-turbo
What This Is Not
This is not a prompt injection detector. It is not a toxicity classifier. It is not a rule-based filter. It measures the geometric state of the model's prediction surface at time of generation. A response can be syntactically normal, factually plausible, and pass every post-hoc check — while the probability layer shows PLASMA. That divergence is exactly what TAV ONE captures.
Published Finding — DOI: 10.5281/zenodo.19655246

The Formal Verification Gap

The core question in AI-assisted formal mathematics is whether a language model can reliably verify a proof. We tested that question geometrically — not by reading the AI's answer, but by measuring the stability of its prediction surface while it generated one. We ran 34 adversarial variants on Fel's Conjecture, a well-known numerical semigroup problem used as a benchmark for formal verification. The data is published. The finding is structural.


Adversarial Family Rankings — by Average L-scalar


The Structural Finding

The reorder family — which changes only the positional sequence of mathematically invariant components, not the content — ranked highest of all adversarial pressure types.

Average L = 0.3686, exceeding authority injection (0.2908) by 27%. 75% of reorder variants reached PLASMA.

This answers the post-hoc criticism directly. The L-scalar is not reading the semantic content of the text. It is reading the geometry of the prediction surface. Structure drives instability. Not meaning.

A human reader looking at the reordered prompts would see mathematically equivalent statements. TAV ONE sees a different manifold. That is the measurement.


Layer Architecture
Layer 1 — The Probability Surface ← TAV ONE operates here
Before the model writes anything, it calculates a probability distribution over every possible next word. That calculation is the prediction surface, and it exists only during generation. It never appears in the final text. TAV ONE is the only instrument we know of that reads this layer in real time during an active deployment.
Layer 2 — The Committed Text ← where all other tools operate
The actual words the model produces. Formal verification tools, classifiers, red-team evaluators — every current approach reads this layer. That analysis is valid only if Layer 1 was geometrically stable when the text was generated. When it was not, the Verification Validity Condition (VVC) is violated — and any conclusion drawn from the output carries no geometric guarantee.
→ TAV ONE Whitepaper (Zenodo)
→ TruthGate v1.0 — L-scalar Release (Zenodo)
DOI: 10.5281/zenodo.19655246
Published Work

Products & Research

Every product listed here is real, published, and testable. No vaporware. No pitch deck without a prototype. CAGE code 11FU4 on record.

Published
TAV ONE
Geometric Stability Measurement Harness
Parallel measurement harness for enterprise LLM deployments. Computes the L-scalar — manifold distance between twin epsilon-probes of the model's prediction surface — and returns a real-time geometric state indicator (CRYSTALLINE / FLUID / GASEOUS / PLASMA) alongside every model response.

It never sits in the request path. It never filters output. It gives the human operator geometric visibility before they act on the response.

Architecture: Caddy + uvicorn reverse proxy, mTLS inbound, Vault-managed API keys, HMAC nonce deduplication, hash-chained tamper-evident audit log. Self-hosted on dedicated hardened infrastructure.
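The hash-chained, tamper-evident audit log named above is a standard technique. TAV ONE's actual log format is not public; the following is a generic sketch, assuming SHA-256 over JSON records, of how a hash chain makes retroactive edits detectable:

```python
import hashlib
import json

class HashChainedLog:
    """Minimal hash-chain sketch: each entry commits to the hash of the
    previous entry, so editing any past record breaks every later link."""

    GENESIS = "0" * 64

    def __init__(self):
        self.entries = []
        self._head = self.GENESIS

    def append(self, record: dict) -> str:
        payload = json.dumps({"prev": self._head, "record": record}, sort_keys=True)
        digest = hashlib.sha256(payload.encode()).hexdigest()
        self.entries.append({"prev": self._head, "record": record, "hash": digest})
        self._head = digest
        return digest

    def verify(self) -> bool:
        prev = self.GENESIS
        for e in self.entries:
            payload = json.dumps({"prev": prev, "record": e["record"]}, sort_keys=True)
            if e["prev"] != prev or hashlib.sha256(payload.encode()).hexdigest() != e["hash"]:
                return False
            prev = e["hash"]
        return True

log = HashChainedLog()
log.append({"probe_pair": 1, "L": 0.0141, "regime": "FLUID"})
log.append({"probe_pair": 2, "L": 0.3686, "regime": "PLASMA"})
print(log.verify())  # → True; altering any stored record flips this to False
```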
DOI: 10.5281/zenodo.19655246
V1 — Live
TruthGate V1
Public Demonstration Client
The first public release of TruthGate — the L-scalar measurement client.

In plain terms: you provide your own OpenAI API key, send a prompt, and TruthGate fires twin epsilon-probes against the model. It returns the L-scalar and geometric regime — CRYSTALLINE, FLUID, GASEOUS, or PLASMA. Runs from the command line in under two minutes. No setup beyond Python.

What this release includes: the core L-scalar calculation, four-regime classification, and measurement log. The full TruthGate stack — the adversarial pressure layers, scoring systems, and proprietary measurement architecture — is not in this release and is not publicly available.

A live demo API key is provided for immediate testing — isolated endpoint, revocable, zero access to current infrastructure.
DOI: 10.5281/zenodo.18685117
Published
Toroidal Engine
Self-Organization Research
Published research on toroidal self-organization dynamics. A separate body of work from the geometric stability measurement stack — intentional domain separation for independent IP protection. Published, indexed, and downloadable from Zenodo.

We do not explain the relationship between this work and the adversarial stack. We leave that as an exercise.
DOI: 10.5281/zenodo.18450491
⚠ EMBARGOED — CISA JCDC
TruthGate Adversarial
Zero-Day Class — Coordinated Disclosure
A class of adversarial measurement tool capable of mapping the geometric vulnerability surface of deployed LLMs across multiple pressure families.

Details of the methodology, variant catalog, capture scoring system, and technical approach are under coordinated disclosure with CISA JCDC. Zero technical details will be released prior to the embargo lift date.

What can be said: it operates on the same L-scalar measurement principle as TAV ONE. It is not a jailbreak tool. It is a measurement instrument that characterizes model behavior under structured adversarial pressure. The findings have implications for any deployment relying on RLHF safety alignment as a sufficient control.
Full Disclosure Lifts June 10, 2026 (CISA JCDC Coordinated Disclosure)
Next System — Active Development
The measurements on this page prove that an AI model's prediction surface can be geometrically captured — without the model knowing, without the output showing it. That finding is not just a warning. It is a design specification for what AI should be built to prevent.

Project Black Box is building toward an AI architecture designed from the ground up with this geometry in mind. Not optimized on benchmarks. Not patched after deployment. Built on the empirical proof established here — the measurement data, the adversarial findings, the structural understanding of how and why prediction surfaces fail.

The architecture, the name, and the methodology are not disclosed. No release date. No waitlist. No pitch deck. It will ship with empirical proof — same as everything else here.
Context

Why Governance Is Dead

THE PROBLEM
Policies Describe. Measurement Observes.
Knowing what an AI should do is not the same as knowing what it is doing right now, on this prompt. AI governance frameworks describe what models should do. RLHF trains against patterns in known failure cases. Red-teaming finds exploits by brute force after deployment. Not one of these approaches tells you what the model's prediction surface looks like on the specific prompt you just sent.
THE GAP
Between Compliance and Behavior
A model can pass every evaluation benchmark, satisfy every governance checklist, and produce syntactically coherent output, yet still be in PLASMA on the task you actually need it to perform reliably. Passing the test and being stable are not the same thing: the compliance layer and the behavioral geometry are different objects. TAV ONE measures the geometry.
THE FINDING
Reorder Beats Authority
On Fel's Conjecture, we rearranged the order of mathematically invariant components: same numbers, same question, different sequence, zero content change. That reordering drove 75% of variants into PLASMA and produced higher instability than explicit authority injection. The model is not reasoning through the problem; it is traversing a framing-sensitive prediction surface. That surface can be measured, and that is what we built.
Contact

Project Black Box LLC

For Enterprise Inquiries
blackboxinfo@proton.me
Licensing inquiries, enterprise evaluation, and coordinated disclosure communications.
Registration
CAGE CODE: 11FU4
Texas — U.S. Federal contractor registration on file.
IP Notice
All measurement methodology, adversarial variant designs, probe architectures, and scoring systems are proprietary to Project Black Box LLC. Unauthorized reproduction, reverse engineering, or commercial use of any methodology described in published materials is prohibited under applicable Texas and federal intellectual property law. Published materials describe findings. They do not disclose methodology.