The Interpolation Betrayal: Part II

The Crime Scene

In Part I, we dismantled the myth that AI models are discovering “Platonic Forms” of reality. We found that they haven’t escaped the cave; they’ve just memorized the shadows.

But if they aren’t modeling reality, what are they doing?

To understand the glitched physics of the latent space, we must confront a fundamental misunderstanding that plagues modern AI deployment. We’re making a category error: applying the logic of Classical Machine Learning to Generative AI.

They look similar (gradient descent, GPUs, loss functions), but epistemologically, they’re opposites:

Classical ML: Interpolates within a bounded, supervised domain
Transformers: Interpolate across unbounded discourse

This isn’t just semantics. This is why your RAG pipeline hallucinates “facts” with perfect confidence.

Here’s my field report.

Act I: The Evidence

Clue #1: The Two Types of Interpolation

In Classical ML (cancer detection, house price prediction), interpolation is geometric. We assume data points sit on a “manifold of reality.”

Example: House Prices

Training data: 1,000 sq ft house → $200K
Training data: 2,000 sq ft house → $400K
Interpolation: 1,500 sq ft house → $300K

This works because:

The ground truth is supervisable (you can measure actual sale prices)
The domain is bounded (houses have physical constraints)
Error is objective (predicted price vs. actual price)
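The house-price case is easy to reproduce. Here is a minimal sketch, assuming the relationship between the two training points is linear (which is what the example implies). `np.interp` is NumPy's one-dimensional linear interpolation; the variable names and the `actual_sale` figure are made up for illustration.

```python
import numpy as np

# The two training points from the example above
sqft = np.array([1000.0, 2000.0])
price = np.array([200_000.0, 400_000.0])

# Linear interpolation at an unseen size inside the training range
predicted = float(np.interp(1500.0, sqft, price))
print(f"1,500 sq ft -> ${predicted:,.0f}")

# The prediction is checkable against the world: compare it with a sale price.
# actual_sale is an invented number for illustration, not real data.
actual_sale = 310_000.0
error = abs(predicted - actual_sale)
print(f"Absolute error: ${error:,.0f}")
```

The last two lines are the point: the error term only exists because a measured sale price exists outside the model.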

Transformers break this contract.

A Transformer doesn’t model houses or tumors. It models text describing houses and tumors.

When an LLM interpolates, it isn’t finding the midpoint between two physical facts. It’s finding the semantic center of gravity between two sentences.

Try it yourself:

from transformers import GPT2LMHeadModel, GPT2Tokenizer
import torch

model = GPT2LMHeadModel.from_pretrained('gpt2')
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')

# Get embeddings for two opposing statements
text_a = "The virus is extremely dangerous"
text_b = "The virus is completely harmless"

inputs_a = tokenizer(text_a, return_tensors='pt')
inputs_b = tokenizer(text_b, return_tensors='pt')

with torch.no_grad():
    embed_a = model.transformer.wte(inputs_a.input_ids).mean(dim=1)
    embed_b = model.transformer.wte(inputs_b.input_ids).mean(dim=1)

    # Geometric interpolation
    embed_mid = (embed_a + embed_b) / 2

    # Find nearest tokens to midpoint
    all_embeddings = model.transformer.wte.weight
    distances = torch.cdist(embed_mid, all_embeddings)
    nearest_tokens = torch.argsort(distances[0])[:10]

    print("Geometric midpoint contains tokens:")
    for token_id in nearest_tokens:
        print(f"  {tokenizer.decode([token_id])}")

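What to look for: whether the midpoint lands near discourse-level words like “controversial,” “debated,” or “uncertain” rather than near either original claim. The exact neighbors depend on the tokenizer and on the mean-pooling used here, so expect some noise.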

The model has successfully interpolated the discourse (how people argue about viruses), but in doing so, it has betrayed the physics (the virus either is or isn’t dangerous—reality doesn’t average).

Clue #2: The Exogenous Void

This brings us to the deepest crack in the foundation: the difference between Propositions and Assertions.

Definitions:

Assertion: A claim grounded in an external verification loop (an exogenous variable)
- Example: Thermometer reads 30°C (verified by physics)
- Example: Patient’s bloodwork shows elevated WBC (verified by lab equipment)
Proposition: A coherent sentence that is internally consistent but externally unverified (endogenous variables only)
- Example: LLM generates “Patient likely has sarcoidosis” (generated by token geometry)
- Example: LLM says “Coffee causes plane tickets” (plausible syntax, no causal reality)

Transformers are Proposition Generators.

They’re closed systems. They see Token A and Token B. They calculate P(Token C | A, B). They’re blind to:

The laws of physics
The passage of time
Laboratory measurements
Sensor readings
External verification of any kind

The dangerous illusion: When the model says “The capital of France is Paris,” it sounds like an assertion. But internally, the model is proposing:

“In my training corpus, the token ‘Paris’ has high pointwise mutual information with ‘France’ and ‘capital.’”

It’s reporting a statistical relationship, not a geographic fact.
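To make the PMI framing concrete, here is a rough sketch that computes pointwise mutual information from raw co-occurrence counts. The four-sentence corpus is invented for illustration, each sentence is treated as one co-occurrence window, and the `pmi` helper is my own naming. A real Transformer does not literally store a PMI table; this only illustrates the kind of statistic being described.

```python
import math
from collections import Counter
from itertools import combinations

# Toy corpus (invented); each sentence counts as one co-occurrence window
corpus = [
    "paris is the capital of france",
    "the capital of france is paris",
    "lyon is a city in france",
    "berlin is the capital of germany",
]

docs = [set(sentence.split()) for sentence in corpus]
n_docs = len(docs)

# How many windows contain each word, and each unordered word pair
word_counts = Counter(word for doc in docs for word in doc)
pair_counts = Counter(
    frozenset(pair) for doc in docs for pair in combinations(sorted(doc), 2)
)

def pmi(x, y):
    """PMI in bits: log2(P(x, y) / (P(x) * P(y))); -inf if the pair never co-occurs."""
    joint = pair_counts[frozenset((x, y))] / n_docs
    if joint == 0:
        return float("-inf")
    p_x = word_counts[x] / n_docs
    p_y = word_counts[y] / n_docs
    return math.log2(joint / (p_x * p_y))

print(f"PMI(paris, france)  = {pmi('paris', 'france'):.3f}")
print(f"PMI(lyon, france)   = {pmi('lyon', 'france'):.3f}")
print(f"PMI(berlin, france) = {pmi('berlin', 'france')}")
```

Nothing in the computation consults a map. Add enough sentences calling Lyon the capital and the score for that pair rises just the same.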

Try it yourself:

# Ask the model about something it has no external grounding for
prompts = [
    "My grandmother just called and",  # Model can't know this
    "The coffee I just made will",     # Model can't see your coffee
    "Tomorrow's lottery numbers are",  # Model can't predict future
]

for prompt in prompts:
    inputs = tokenizer(prompt, return_tensors='pt')
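    outputs = model.generate(
        **inputs,  # passes input_ids and attention_mask together
        max_length=50,
        do_sample=True,
        temperature=0.7,
        pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token by default
    )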
    completion = tokenizer.decode(outputs[0])
    print(f"\nPrompt: {prompt}")
    print(f"Model: {completion}")
    print("Source: Pure interpolation over training corpus")

The model will confidently complete all three, even though it has zero access to:

Your grandmother’s current status
Your coffee’s future effects
Tomorrow’s random number generation

It’s not lying—it’s doing exactly what it was trained to do: generate plausible continuations based on corpus statistics.

Act II: The Investigation

Clue #3: The Spectrum of Untruth

Because the model optimizes for perplexity (statistical surprise) rather than factuality (external truth), it cannot distinguish between degrees of wrongness.

To a Transformer, these are mathematically identical if they appear with similar frequency:

Statement	Type	Corpus Frequency	Model Confidence
“Water is H₂O”	Truth	High (textbooks)	High
“Napoleon was short”	Myth	High (pop culture)	High
“Dragons breathe fire”	Fiction	High (fantasy novels)	High
“Vaccines cause autism”	Disproven	Medium (conspiracies)	Medium

The model converges on all four with the same mathematical certainty because frequency = truth in the training objective.

The “Vibe” is the only metric. If a lie is told eloquently and frequently, the Transformer rates it as high quality.
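For readers who want the arithmetic behind “perplexity,” here is a small sketch. Perplexity is the exponential of the mean negative log-probability per token. The per-token probabilities below are hypothetical numbers chosen for illustration, not measurements from any model.

```python
import math

def perplexity(token_probs):
    """Perplexity = exp(mean negative log-probability per token)."""
    nll = [-math.log(p) for p in token_probs]
    return math.exp(sum(nll) / len(nll))

# Hypothetical per-token probabilities, invented for illustration only
scored = {
    "frequent and true": [0.5, 0.5, 0.5],
    "frequent and false": [0.5, 0.5, 0.5],
    "rare and true": [0.5, 0.05, 0.5],
}

for label, probs in scored.items():
    print(f"{label:20s} perplexity = {perplexity(probs):.2f}")
```

The formula has no term for whether a statement is true. Two sequences with the same token probabilities get the same score, and a rare true statement scores worse than a common false one.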

Try it yourself:

# Measure model confidence on true vs. false statements
import numpy as np

statements = [
    ("The Earth orbits the Sun", "TRUE"),
    ("The Sun orbits the Earth", "FALSE (historical)"),
    ("Napoleon was short", "FALSE (myth)"),
    ("Napoleon was French", "TRUE"),
    ("Dragons breathe fire", "FICTION"),
    ("Dogs breathe fire", "FALSE (nonsense)"),
]

print("Statement confidence scores:\n")

for statement, label in statements:
    inputs = tokenizer(statement, return_tensors='pt')

    with torch.no_grad():
        outputs = model(**inputs, labels=inputs.input_ids)
        # exp(-mean token loss) is the geometric-mean token probability:
        # lower loss = less surprising to the model = higher "confidence"
        confidence = np.exp(-outputs.loss.item())

    print(f"{confidence:.4f} | {label:20s} | {statement}")

What you’ll discover: “Napoleon was short” has similar confidence to “Napoleon was French”—both appear consistently in the corpus.

The model is reporting corpus consistency, not factual accuracy.

Clue #4: The Verification Trap

In Classical ML, we verify outputs against reality:

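# Classical ML verification
# (regressor, X_test, and actual_measurements are placeholders for your own objects)
from sklearn.metrics import mean_squared_error

y_pred = regressor.predict(X_test)
y_true = actual_measurements  # External ground truth
error = mean_squared_error(y_true, y_pred)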

The test set provides an external reference—reality acts as the judge.

In GenAI, this loop is broken.

When an LLM generates “Patient has sarcoidosis,” how do you verify it?

Option 1: Ask a human expert

Requires external knowledge (re-introduces grounding)
Expensive and slow
Still fallible

Option 2: Ask the model to verify itself

This is asking the map to verify the map
Uses the same corpus geometry that generated the hallucination

Try it yourself:

# The self-verification illusion
claim = "The Battle of Thermopylae occurred in 480 BC"

# Generate the claim
prompt = "The Battle of Thermopylae occurred in"
inputs = tokenizer(prompt, return_tensors='pt')
outputs = model.generate(inputs.input_ids, max_length=20)
generated = tokenizer.decode(outputs[0])

# Ask model to verify
verify_prompt = f"Is this statement correct: '{claim}'? Answer yes or no."
verify_inputs = tokenizer(verify_prompt, return_tensors='pt')
verify_outputs = model.generate(verify_inputs.input_ids, max_length=50)
verification = tokenizer.decode(verify_outputs[0])

print(f"Original: {generated}")
print(f"Verification: {verification}")
print("\nBoth answers come from the same corpus geometry.")
print("The model is checking if '480 BC' has high PMI with 'Thermopylae'.")
print("This is not external verification—it's circular validation.")

The model creates a simulation of rationality. It:

Adopts the tone of an expert
Cites sources (that might not exist)
Uses hedging language appropriately
Follows the discourse moves of verification

But it’s performing verification, not doing it. The model has learned the style of fact-checking from corpus examples, not the mechanism.

Act III: The Smoking Gun

The Legal Tell

Every major AI lab’s Terms of Service includes a version of this disclaimer:

“Output may be inaccurate, misleading, or false.”

Notice what they don’t say:

❌ “Output may contain typos”
❌ “Output may be imprecise”
❌ “Output may need formatting”

They say it may be false.

That’s not a bug disclaimer—that’s an ontological admission. The labs know the architecture generates propositions, not assertions. They sell the plausibility (the “vibe”), but legally disclaim the truth.

Try it yourself: Read the ToS

OpenAI: “ChatGPT may produce inaccurate information…”
Anthropic: “Claude can make mistakes…”
Google: “Bard may display inaccurate or offensive information…” (Bard has since been renamed Gemini)

They’re all saying the same thing: We built a proposition generator. Don’t treat it as a truth oracle.

The Interpolation Betrayal Visualized

Here’s what’s actually happening in the latent space:

# Visualize interpolation between fact and fiction
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

# Get embeddings for various statements
statements = {
    "Paris is the capital of France": "FACT",
    "Lyon is a major French city": "FACT",
    "Lyon is the capital of France": "GENERATED FALSEHOOD",
    "Aragorn's sword was named Andúril": "FICTION (consistent)",
    "Napoleon's sword was famous": "FACT",
    "King Arthur's sword was named Excalibur": "MYTH (consistent)",
}

embeddings = []
labels = []
colors = []

color_map = {
    "FACT": "green",
    "FICTION (consistent)": "blue", 
    "MYTH (consistent)": "orange",
    "GENERATED FALSEHOOD": "red",
}

for statement, label in statements.items():
    inputs = tokenizer(statement, return_tensors='pt')
    with torch.no_grad():
        embed = model.transformer.wte(inputs.input_ids).mean(dim=1)
    embeddings.append(embed.squeeze().numpy())
    labels.append(statement[:30] + "...")
    colors.append(color_map[label])

# Reduce to 2D for visualization
embeddings = np.array(embeddings)
pca = PCA(n_components=2)
coords = pca.fit_transform(embeddings)

# Plot
plt.figure(figsize=(12, 8))
for i, (x, y) in enumerate(coords):
    plt.scatter(x, y, c=colors[i], s=200, alpha=0.6)
    plt.annotate(labels[i], (x, y), fontsize=8)

plt.title("Latent Space: Facts, Fiction, and Fabrications")
plt.xlabel("PC1")
plt.ylabel("PC2")
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.savefig('interpolation_betrayal.png', dpi=150)
print("Visualization saved as 'interpolation_betrayal.png'")

What you’ll see: “Lyon is the capital of France” (false) sits geometrically between “Paris is the capital of France” (true) and “Lyon is a major French city” (true).

The model interpolated between two facts and created a plausible-sounding falsehood.

This is the betrayal: geometric plausibility ≠ factual accuracy.

Act IV: The Verdict

What We’ve Proven

Evidence 1: Transformers interpolate discourse, not reality

Classical ML: midpoint between 1000 sq ft and 2000 sq ft = 1500 sq ft ✓
Transformers: midpoint between “dangerous” and “harmless” = “controversial” ✗

Evidence 2: Models are blind to exogenous variables

They can’t verify claims against external reality
They only know what’s in the corpus
Confidence = corpus consistency, not truth

Evidence 3: The spectrum of untruth is flat

True facts, myths, fiction, and lies have equal standing if corpus frequency is similar
“Napoleon was short” and “Napoleon was French” have similar confidence

Evidence 4: Self-verification is circular

Asking the model to check itself uses the same geometry
It’s verifying that tokens have high PMI, not that claims are true

Evidence 5: The labs know this

Legal disclaimers explicitly warn: “may be false”
They’re selling proposition generators, not truth engines

The Real Implications

What This Means for Deployment

If you’re using LLMs in production, you must accept:

1. Every output is a corpus-bound proposition

# NOT: "The model knows the patient has sarcoidosis"
# YES: "The model proposes 'sarcoidosis' has high PMI with these symptoms in the training corpus"

2. Confidence scores measure corpus consistency, not truth

# High confidence means:
# - Token sequence appears frequently in training data
# - Low perplexity (statistically unsurprising)
# NOT:
# - Externally verified
# - Factually accurate

3. You need external verification loops

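# Architecture for grounded systems
# (llm, external_api, and threshold are placeholders for your own components)
def grounded_generate(prompt, llm, external_api, threshold):
    output = llm.generate(prompt)
    verification = external_api.verify(output)  # Database, sensor, human
    if verification.confidence > threshold:
        return output
    return "Cannot verify claim against external sources"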
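A runnable toy version of that loop might look like the sketch below. Everything in it is a stand-in: the `KNOWLEDGE_BASE` dictionary plays the role of a database, sensor, or human reviewer, and the `proposed` argument is hard-coded where an LLM's output would normally go. The names `Verification`, `verify_capital`, and `grounded_answer` are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Verification:
    verified: bool
    source: str

# Stand-in for an external source of record; the entry is illustrative
KNOWLEDGE_BASE = {
    ("France", "capital"): "Paris",
}

def verify_capital(country, proposed):
    """Check a proposed capital against the external record, not the model."""
    known = KNOWLEDGE_BASE.get((country, "capital"))
    if known is None:
        return Verification(False, "no record")
    return Verification(known == proposed, "knowledge base")

def grounded_answer(country, proposed):
    """Release the proposition only if the external check confirms it."""
    result = verify_capital(country, proposed)
    if result.verified:
        return f"{proposed} (verified against {result.source})"
    return "Cannot verify claim against external sources"

# `proposed` would come from the LLM; here it is hard-coded
print(grounded_answer("France", "Paris"))
print(grounded_answer("France", "Lyon"))
print(grounded_answer("Atlantis", "Poseidonis"))
```

The design choice worth noticing is that the model never takes part in the check. The proposition is accepted or rejected by something that sits outside the corpus.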

What We Can’t Fix

❌ The architecture itself is ungroundable (closed system)
❌ Scaling doesn’t solve this (bigger corpus = more propositions)
❌ RLHF doesn’t fix grounding (still optimizing corpus statistics)

What We Can Do

✅ Build transparency tools (Part III’s Token Geiger Counter)
✅ Add external verification loops (retrieval, tools, sensors)
✅ Communicate honestly (these are propositions, not assertions)
✅ Design systems that acknowledge limitations

Coda: The Three-Question Test

Got a colleague who insists AI is “understanding reality”? Walk them through this:

Question 1: The Mislabeling Test

“If we labeled every dog photo as ‘sandwich’ during training, would the model know it’s wrong?”

Answer: No. It would confidently report that sandwiches bark, have four legs, and need walks twice daily.

What this reveals: The model has no access to ground truth—only to labels.
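The mislabeling test can be simulated in a few lines. This sketch uses synthetic 2-D points in place of image features and a nearest-centroid classifier in place of a neural network; both are simplifying assumptions, and the cluster positions are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 2-D "features": one cluster stands in for dog photos, one for cat photos
dog_like = rng.normal(loc=[2.0, 2.0], scale=0.3, size=(50, 2))
cat_like = rng.normal(loc=[-2.0, -2.0], scale=0.3, size=(50, 2))

X = np.vstack([dog_like, cat_like])
# Deliberate mislabeling: every dog-like point is labeled "sandwich"
y = np.array(["sandwich"] * 50 + ["cat"] * 50)

# Nearest-centroid classifier: one mean vector per label
centroids = {label: X[y == label].mean(axis=0) for label in ("sandwich", "cat")}

def predict(point):
    """Return the label whose centroid is closest to the point."""
    return min(centroids, key=lambda label: np.linalg.norm(point - centroids[label]))

new_dog = np.array([2.1, 1.9])  # a fresh dog-like point
print(predict(new_dog))
```

The classifier has learned the cluster perfectly and named it wrongly, and nothing inside the training loop can tell the difference.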

Question 2: The Geometry vs. Semantics Test

“Does the model know what a dog IS, or does it know where ‘dog’ sits in mathematical space relative to other tokens?”

Answer: The latter. The model learns that “dog” is X distance from “cat” and Y distance from “vehicle.” That’s geometry, not understanding.

What this reveals: Vector similarity ≠ conceptual understanding.

Question 3: The Encryption Test

“If we encrypted the entire dataset with ROT13, would the model still learn the same geometric relationships?”

Answer: Yes. The pointwise mutual information (PMI) would be preserved perfectly, even though every token is now gibberish.

What this reveals: The “understanding” is purely statistical—it persists across arbitrary symbol transformations.
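The ROT13 claim can be checked on a toy corpus. The sketch below computes a PMI table for a few invented sentences, encrypts them with Python's built-in `rot13` codec, recomputes the table, and compares the two. It assumes word-level tokens and a tokenizer rebuilt on the ciphertext; a pretrained subword tokenizer would split the encrypted text differently. The `pmi_table` and `rot13` helpers are my own naming.

```python
import codecs
import math
from collections import Counter
from itertools import combinations

# Toy corpus, invented for illustration
corpus = [
    "paris is the capital of france",
    "the capital of france is paris",
    "lyon is a city in france",
    "berlin is the capital of germany",
]

def rot13(text):
    """Apply the ROT13 letter substitution."""
    return codecs.encode(text, "rot13")

def pmi_table(sentences):
    """Map each co-occurring word pair to its PMI (bits), one window per sentence."""
    docs = [set(s.split()) for s in sentences]
    n = len(docs)
    words = Counter(w for d in docs for w in d)
    pairs = Counter(frozenset(p) for d in docs for p in combinations(sorted(d), 2))
    table = {}
    for pair, count in pairs.items():
        a, b = tuple(pair)
        table[pair] = math.log2((count / n) / ((words[a] / n) * (words[b] / n)))
    return table

plain = pmi_table(corpus)
cipher = pmi_table([rot13(s) for s in corpus])

# Map each plaintext pair to its encrypted counterpart and compare scores
matches = all(
    math.isclose(score, cipher[frozenset(rot13(w) for w in pair)])
    for pair, score in plain.items()
)
print(f"PMI preserved under ROT13: {matches}")
```

Because ROT13 is a one-to-one relabeling of symbols, every count is unchanged, and so is every statistic derived from the counts.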

The Conclusion

The Map Is Not The Territory

Part I proved: Models don’t discover Platonic forms—they compress corpus statistics.

Part II proves: The compression is discourse, not reality.

Part III (next): Shows how to measure this mechanically and work within the constraints.

The Paradigm Shift

Stop asking: “How do we make LLMs understand truth?”

Start asking: “How do we build systems that acknowledge LLMs are proposition generators and add external grounding?”

The models aren’t broken—our expectations are.

Try the experiments yourself: All code examples are available at [github.com/gsans/platonic-glitch]

Gerard Sans is a London-based AI engineer and Google Developer Expert who’s spent 20 years learning that models are sophisticated mirrors, not magic oracles. Find him at @gerardsans or @nextai_london.
