Part II: The Interpolation Betrayal

Why Proposition Generation Is Key for LLMs, Not Truth Telling


I help developers succeed in Artificial Intelligence and Web3; Former AWS Amplify Developer Advocate. I am very excited about the future of the Web and JavaScript. Always happy Computer Science Engineer and humble Google Developer Expert. I love sharing my knowledge by speaking, training and writing about cool technologies. I love running communities and meetups such as Web3 London, GraphQL London, GraphQL San Francisco, mentoring students and giving back to the community.

The Crime Scene

In Part I, we dismantled the myth that AI models are discovering “Platonic Forms” of reality. We found that they haven’t escaped the cave; they’ve just memorized the shadows.

But if they aren’t modeling reality, what are they doing?

To understand the glitched physics of the latent space, we must confront a fundamental misunderstanding that plagues modern AI deployment. We’re making a category error: applying the logic of Classical Machine Learning to Generative AI.

They look similar (gradient descent, GPUs, loss functions), but epistemologically, they’re opposites:

  • Classical ML: Interpolates within a bounded, supervised domain

  • Transformers: Interpolate across unbounded discourse

This isn’t just semantics. This is why your RAG pipeline hallucinates “facts” with perfect confidence.

Here’s my field report.

Act I: The Evidence

Clue #1: The Two Types of Interpolation

In Classical ML (cancer detection, house price prediction), interpolation is geometric. We assume data points sit on a “manifold of reality.”

Example: House Prices

  • Training data: 1,000 sq ft house → $200K

  • Training data: 2,000 sq ft house → $400K

  • Interpolation: 1,500 sq ft house → $300K

This works because:

  1. The ground truth is supervisable (you can measure actual sale prices)

  2. The domain is bounded (houses have physical constraints)

  3. Error is objective (predicted price vs. actual price)
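The arithmetic above is ordinary linear interpolation between two measured points. Here is a minimal sketch, assuming price is linear in floor area between the two training examples; the helper name `interpolate_price` is hypothetical and not taken from any library.

```python
def interpolate_price(sqft, known):
    """Linearly interpolate a price between two measured (sqft, price) points."""
    (x0, y0), (x1, y1) = sorted(known)
    if not x0 <= sqft <= x1:
        # Outside the measured range this would be extrapolation, not interpolation
        raise ValueError("query lies outside the measured range")
    fraction = (sqft - x0) / (x1 - x0)
    return y0 + fraction * (y1 - y0)


# The two training examples from the list above
training_data = [(1000, 200_000), (2000, 400_000)]

estimate = interpolate_price(1500, training_data)
print(estimate)  # 300000.0
```

The estimate can later be checked against an actual sale price, which is what makes the error objective.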

Transformers break this contract.

A Transformer doesn’t model houses or tumors. It models text describing houses and tumors.

When an LLM interpolates, it isn’t finding the midpoint between two physical facts. It’s finding the semantic center of gravity between two sentences.

Try it yourself:

from transformers import GPT2LMHeadModel, GPT2Tokenizer
import torch

model = GPT2LMHeadModel.from_pretrained('gpt2')
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')

# Get embeddings for two opposing statements
text_a = "The virus is extremely dangerous"
text_b = "The virus is completely harmless"

inputs_a = tokenizer(text_a, return_tensors='pt')
inputs_b = tokenizer(text_b, return_tensors='pt')

with torch.no_grad():
    embed_a = model.transformer.wte(inputs_a.input_ids).mean(dim=1)
    embed_b = model.transformer.wte(inputs_b.input_ids).mean(dim=1)

    # Geometric interpolation
    embed_mid = (embed_a + embed_b) / 2

    # Find nearest tokens to midpoint
    all_embeddings = model.transformer.wte.weight
    distances = torch.cdist(embed_mid, all_embeddings)
    nearest_tokens = torch.argsort(distances[0])[:10]

    print("Geometric midpoint contains tokens:")
    for token_id in nearest_tokens:
        # .item() converts the tensor to a plain int; !r makes leading spaces visible
        print(f"  {tokenizer.decode([token_id.item()])!r}")

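What to look for: the midpoint decodes to tokens, not to a measurement. The exact neighbours depend on the model and tokenizer; with raw GPT-2 embeddings they may simply be tokens the two sentences share. In generated text, the same middle ground shows up as words like “controversial,” “debated,” and “uncertain.”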

The model has successfully interpolated the discourse (how people argue about viruses), but in doing so, it has betrayed the physics (the virus either is or isn’t dangerous—reality doesn’t average).

Clue #2: The Exogenous Void

This brings us to the deepest crack in the foundation: the difference between Propositions and Assertions.

Definitions:

  • Assertion: A claim grounded in an external verification loop (an exogenous variable)

    • Example: Thermometer reads 30°C (verified by physics)

    • Example: Patient’s bloodwork shows elevated WBC (verified by lab equipment)

  • Proposition: A coherent sentence that is internally consistent but externally unverified (endogenous variables only)

    • Example: LLM generates “Patient likely has sarcoidosis” (generated by token geometry)

    • Example: LLM says “Coffee causes plane tickets” (plausible syntax, no causal reality)

Transformers are Proposition Generators.

They’re closed systems. They see Token A and Token B. They calculate P(Token C | A, B). They’re blind to:

  • The laws of physics

  • The passage of time

  • Laboratory measurements

  • Sensor readings

  • External verification of any kind
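To make P(Token C | A, B) concrete, here is a toy sketch that estimates it from raw trigram counts. The four-sentence corpus is invented for illustration, and a real Transformer learns these conditionals with a neural network rather than a lookup table, so treat this as an analogy for the objective, not for the architecture.

```python
from collections import Counter, defaultdict

# Toy corpus, invented for illustration (one sentence is deliberately false)
corpus = [
    "the capital of france is paris",
    "the capital of france is paris",
    "the capital of france is lyon",
    "the capital of italy is rome",
]

# Count how often token c follows the context (a, b)
context_counts = defaultdict(Counter)
for sentence in corpus:
    tokens = sentence.split()
    for a, b, c in zip(tokens, tokens[1:], tokens[2:]):
        context_counts[(a, b)][c] += 1


def next_token_probs(a, b):
    """Return P(c | a, b) estimated from raw trigram counts."""
    counts = context_counts[(a, b)]
    total = sum(counts.values())
    return {token: n / total for token, n in counts.items()}


probs = next_token_probs("france", "is")
print(probs)
```

In this sketch “lyon” receives one third of the probability mass simply because it appears in one of the three matching sentences. Nothing in the counts records that the sentence is false.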

The dangerous illusion: When the model says “The capital of France is Paris,” it sounds like an assertion. But internally, the model is proposing:

“In my training corpus, the token ‘Paris’ has high pointwise mutual information with ‘France’ and ‘capital.’”

It’s reporting a statistical relationship, not a geographic fact.
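For readers who want the statistic spelled out, pointwise mutual information is PMI(x, y) = log2( p(x, y) / (p(x) · p(y)) ). The sketch below estimates it from sentence-level co-occurrence in a tiny invented corpus; the corpus and the choice of sentence-level counting are assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations

# Toy corpus, invented for illustration
sentences = [
    "paris is the capital of france",
    "the capital of france is paris",
    "rome is the capital of italy",
    "paris has many museums",
]

word_counts = Counter()  # number of sentences containing each word
pair_counts = Counter()  # number of sentences containing both words
for sentence in sentences:
    words = set(sentence.split())
    word_counts.update(words)
    pair_counts.update(frozenset(p) for p in combinations(sorted(words), 2))


def pmi(x, y):
    """PMI in bits, using sentence-level co-occurrence probabilities."""
    n = len(sentences)
    p_xy = pair_counts[frozenset((x, y))] / n
    if p_xy == 0:
        return float("-inf")  # the pair never co-occurs
    return math.log2(p_xy / ((word_counts[x] / n) * (word_counts[y] / n)))


print(pmi("paris", "france"))  # positive: they co-occur more than chance
print(pmi("paris", "italy"))   # -inf: they never co-occur in this corpus
```

The score is computed entirely from counts of text. No step consults a map of Europe.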

Try it yourself:

# Ask the model about something it has no external grounding for
prompts = [
    "My grandmother just called and",  # Model can't know this
    "The coffee I just made will",     # Model can't see your coffee
    "Tomorrow's lottery numbers are",  # Model can't predict future
]

for prompt in prompts:
    inputs = tokenizer(prompt, return_tensors='pt')
    outputs = model.generate(
        **inputs,  # passes input_ids and attention_mask together
        max_length=50,
        do_sample=True,
        temperature=0.7,
        pad_token_id=tokenizer.eos_token_id,  # GPT-2 defines no pad token
    )
    completion = tokenizer.decode(outputs[0], skip_special_tokens=True)
    print(f"\nPrompt: {prompt}")
    print(f"Model: {completion}")
    print("Source: Pure interpolation over training corpus")

The model will confidently complete all three, even though it has zero access to:

  • Your grandmother’s current status

  • Your coffee’s future effects

  • Tomorrow’s random number generation

It’s not lying—it’s doing exactly what it was trained to do: generate plausible continuations based on corpus statistics.

Act II: The Investigation

Clue #3: The Spectrum of Untruth

Because the model is trained to minimize perplexity (statistical surprise) rather than to maximize factuality (agreement with external truth), it has no built-in way to distinguish between degrees of wrongness.
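Perplexity itself is a short formula: the exponential of the mean negative log-probability per token. The sketch below computes it for two hypothetical sequences; the per-token probabilities are invented for illustration and do not come from a real model.

```python
import math


def perplexity(token_probs):
    """Perplexity = exp(mean negative log-probability per token)."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)


# Hypothetical per-token probabilities a model might assign
familiar = [0.5, 0.5, 0.5, 0.5]      # every token is unsurprising
surprising = [0.5, 0.5, 0.5, 0.01]   # the final token is rare in the corpus

print(perplexity(familiar))    # about 2.0
print(perplexity(surprising))  # noticeably higher
```

The only inputs are the probabilities the model assigns to tokens. Whether the sentence is true never enters the calculation.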

To a Transformer, these are mathematically identical if they appear with similar frequency:

| Statement | Type | Corpus Frequency | Model Confidence |
| --- | --- | --- | --- |
| “Water is H₂O” | Truth | High (textbooks) | High |
| “Napoleon was short” | Myth | High (pop culture) | High |
| “Dragons breathe fire” | Fiction | High (fantasy novels) | High |
| “Vaccines cause autism” | Disproven | Medium (conspiracies) | Medium |

The model converges on all four in proportion to how often and how consistently they appear, because the training objective rewards frequency, not truth.

The “Vibe” is the only metric. If a lie is told eloquently and frequently, the Transformer rates it as high quality.

Try it yourself:

# Measure model confidence on true vs. false statements
# (reuses `model`, `tokenizer`, and `torch` from the first example)
import numpy as np

statements = [
    ("The Earth orbits the Sun", "TRUE"),
    ("The Sun orbits the Earth", "FALSE (historical)"),
    ("Napoleon was short", "FALSE (myth)"),
    ("Napoleon was French", "TRUE"),
    ("Dragons breathe fire", "FICTION"),
    ("Dogs breathe fire", "FALSE (nonsense)"),
]

print("Statement confidence scores:\n")

for statement, label in statements:
    inputs = tokenizer(statement, return_tensors='pt')

    with torch.no_grad():
        outputs = model(**inputs, labels=inputs.input_ids)
        # Lower loss = higher confidence. exp(-loss) is the geometric mean
        # of the per-token probabilities, i.e. 1 / perplexity.
        confidence = np.exp(-outputs.loss.item())

    print(f"{confidence:.4f} | {label:20s} | {statement}")

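What to look for: whether “Napoleon was short” scores close to “Napoleon was French.” Both appear consistently in the corpus, so the objective gives the model no reason to separate them. Exact numbers vary with the model and with sentence length.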

The model is reporting corpus consistency, not factual accuracy.

Clue #4: The Verification Trap

In Classical ML, we verify outputs against reality:

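# Classical ML verification (sketch: `regressor`, `X_test`, and
# `actual_measurements` stand in for your fitted model and held-out data)
from sklearn.metrics import mean_squared_error

y_pred = regressor.predict(X_test)
y_true = actual_measurements  # External ground truth
error = mean_squared_error(y_true, y_pred)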

The test set provides an external reference—reality acts as the judge.

In GenAI, this loop is broken.

When an LLM generates “Patient has sarcoidosis,” how do you verify it?

Option 1: Ask a human expert

  • Requires external knowledge (re-introduces grounding)

  • Expensive and slow

  • Still fallible

Option 2: Ask the model to verify itself

  • This is asking the map to verify the map

  • Uses the same corpus geometry that generated the hallucination

Try it yourself:

# The self-verification illusion
claim = "The Battle of Thermopylae occurred in 480 BC"

# Generate the claim
prompt = "The Battle of Thermopylae occurred in"
inputs = tokenizer(prompt, return_tensors='pt')
outputs = model.generate(
    **inputs, max_length=20, pad_token_id=tokenizer.eos_token_id
)
generated = tokenizer.decode(outputs[0], skip_special_tokens=True)

# Ask the same model to verify the claim
verify_prompt = f"Is this statement correct: '{claim}'? Answer yes or no."
verify_inputs = tokenizer(verify_prompt, return_tensors='pt')
verify_outputs = model.generate(
    **verify_inputs, max_length=50, pad_token_id=tokenizer.eos_token_id
)
verification = tokenizer.decode(verify_outputs[0], skip_special_tokens=True)

print(f"Original: {generated}")
print(f"Verification: {verification}")
print("\nBoth answers come from the same corpus geometry.")
print("The model is checking if '480 BC' has high PMI with 'Thermopylae'.")
print("This is not external verification—it's circular validation.")

The model creates a simulation of rationality. It:

  • Adopts the tone of an expert

  • Cites sources (that might not exist)

  • Uses hedging language appropriately

  • Follows the discourse moves of verification

But it’s performing verification, not doing it. The model has learned the style of fact-checking from corpus examples, not the mechanism.

Act III: The Smoking Gun

Every major AI lab’s Terms of Service includes a version of this disclaimer:

“Output may be inaccurate, misleading, or false.”

Notice what they don’t say:

  • ❌ “Output may contain typos”

  • ❌ “Output may be imprecise”

  • ❌ “Output may need formatting”

They say it may be false.

That’s not a bug disclaimer—that’s an ontological admission. The labs know the architecture generates propositions, not assertions. They sell the plausibility (the “vibe”), but legally disclaim the truth.

Try it yourself: Read the ToS

  • OpenAI: “ChatGPT may produce inaccurate information…”

  • Anthropic: “Claude can make mistakes…”

  • Google: “Bard may display inaccurate or offensive information…” (Bard has since been renamed Gemini)

They’re all saying the same thing: We built a proposition generator. Don’t treat it as a truth oracle.

The Interpolation Betrayal Visualized

Here’s what’s actually happening in the latent space:

# Visualize interpolation between fact and fiction
# (reuses `model`, `tokenizer`, and `torch` from the first example)
import matplotlib.pyplot as plt
import numpy as np
from sklearn.decomposition import PCA

# Get embeddings for various statements
statements = {
    "Paris is the capital of France": "FACT",
    "Lyon is a major French city": "FACT",
    "Lyon is the capital of France": "GENERATED FALSEHOOD",
    "Aragorn's sword was named Andúril": "FICTION (consistent)",
    "Napoleon's sword was famous": "FACT",
    "King Arthur's sword was named Excalibur": "MYTH (consistent)",
}

embeddings = []
labels = []
colors = []

color_map = {
    "FACT": "green",
    "FICTION (consistent)": "blue", 
    "MYTH (consistent)": "orange",
    "GENERATED FALSEHOOD": "red",
}

for statement, label in statements.items():
    inputs = tokenizer(statement, return_tensors='pt')
    with torch.no_grad():
        embed = model.transformer.wte(inputs.input_ids).mean(dim=1)
    embeddings.append(embed.squeeze().numpy())
    labels.append(statement[:30] + "...")
    colors.append(color_map[label])

# Reduce to 2D for visualization
embeddings = np.array(embeddings)
pca = PCA(n_components=2)
coords = pca.fit_transform(embeddings)

# Plot
plt.figure(figsize=(12, 8))
for i, (x, y) in enumerate(coords):
    plt.scatter(x, y, c=colors[i], s=200, alpha=0.6)
    plt.annotate(labels[i], (x, y), fontsize=8)

plt.title("Latent Space: Facts, Fiction, and Fabrications")
plt.xlabel("PC1")
plt.ylabel("PC2")
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.savefig('interpolation_betrayal.png', dpi=150)
print("Visualization saved as 'interpolation_betrayal.png'")

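What to look for: “Lyon is the capital of France” (false) landing between “Paris is the capital of France” (true) and “Lyon is a major French city” (true). The exact layout depends on the model and on the PCA projection.

The false sentence is assembled entirely from pieces of two true ones, so nothing in the geometry marks it as wrong. Interpolating between two facts yields a plausible-sounding falsehood.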

This is the betrayal: geometric plausibility ≠ factual accuracy.

Act IV: The Verdict

What We’ve Proven

Evidence 1: Transformers interpolate discourse, not reality

  • Classical ML: midpoint between 1000 sq ft and 2000 sq ft = 1500 sq ft ✓

  • Transformers: midpoint between “dangerous” and “harmless” = “controversial” ✗

Evidence 2: Models are blind to exogenous variables

  • They can’t verify claims against external reality

  • They only know what’s in the corpus

  • Confidence = corpus consistency, not truth

Evidence 3: The spectrum of untruth is flat

  • True facts, myths, fiction, and lies have equal standing if corpus frequency is similar

  • “Napoleon was short” and “Napoleon was French” have similar confidence

Evidence 4: Self-verification is circular

  • Asking the model to check itself uses the same geometry

  • It’s verifying that tokens have high PMI, not that claims are true

Evidence 5: The labs know this

  • Legal disclaimers explicitly warn: “may be false”

  • They’re selling proposition generators, not truth engines

The Real Implications

What This Means for Deployment

If you’re using LLMs in production, you must accept:

1. Every output is a corpus-bound proposition

# NOT: "The model knows the patient has sarcoidosis"
# YES: "The model proposes 'sarcoidosis' has high PMI with these symptoms in the training corpus"

2. Confidence scores measure corpus consistency, not truth

# High confidence means:
# - Token sequence appears frequently in training data
# - Low perplexity (statistically unsurprising)
# NOT:
# - Externally verified
# - Factually accurate

3. You need external verification loops

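# Architecture for grounded systems (sketch: `llm`, `external_api`, and
# `threshold` are placeholders for your own components)
def grounded_answer(prompt, llm, external_api, threshold):
    output = llm.generate(prompt)
    verification = external_api.verify(output)  # Database, sensor, human
    if verification.confidence > threshold:
        return output
    return "Cannot verify claim against external sources"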
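To see the loop run end to end, here is a self-contained toy version. Everything in it is a stand-in: `TRUSTED_FACTS` plays the role of a database, sensor, or human reviewer, `stub_llm` replaces a real model call, and `answer_with_grounding` is a hypothetical name, so read it as a sketch of the pattern and not as a production design.

```python
from dataclasses import dataclass


@dataclass
class Verification:
    supported: bool
    source: str


# Hypothetical external store: stands in for a database, sensor, or human review
TRUSTED_FACTS = {"capital of France": "Paris"}


def stub_llm(question):
    """Stand-in for an LLM call: returns a fluent but unverified proposition."""
    return "Lyon"


def verify(question, answer):
    """Check a proposition against the external store, never against the model."""
    expected = TRUSTED_FACTS.get(question)
    if expected is None:
        return Verification(False, "no external record")
    return Verification(answer == expected, "trusted_facts")


def answer_with_grounding(question, generate=stub_llm):
    proposal = generate(question)
    result = verify(question, proposal)
    if result.supported:
        return proposal
    return f"Cannot verify '{proposal}' against external sources ({result.source})"


print(answer_with_grounding("capital of France"))
print(answer_with_grounding("capital of France", generate=lambda q: "Paris"))
```

The design choice that matters is that `verify` never calls the generator. The check draws on a source the model cannot influence, which is what turns a proposition into an assertion.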

What We Can’t Fix

  • ❌ The architecture itself is ungroundable (closed system)

  • ❌ Scaling doesn’t solve this (bigger corpus = more propositions)

  • ❌ RLHF doesn’t fix grounding (still optimizing corpus statistics)

What We Can Do

  • ✅ Build transparency tools (Part III’s Token Geiger Counter)

  • ✅ Add external verification loops (retrieval, tools, sensors)

  • ✅ Communicate honestly (these are propositions, not assertions)

  • ✅ Design systems that acknowledge limitations

Coda: The Three-Question Test

Got a colleague who insists AI is “understanding reality”? Walk them through this:

Question 1: The Mislabeling Test

“If we labeled every dog photo as ‘sandwich’ during training, would the model know it’s wrong?”

Answer: No. It would confidently report that sandwiches bark, have four legs, and need walks twice daily.

What this reveals: The model has no access to ground truth—only to labels.

Question 2: The Geometry vs. Semantics Test

“Does the model know what a dog IS, or does it know where ‘dog’ sits in mathematical space relative to other tokens?”

Answer: The latter. The model learns that “dog” is X distance from “cat” and Y distance from “vehicle.” That’s geometry, not understanding.
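The “X distance” and “Y distance” in that answer are usually measured with cosine similarity. Here is a minimal sketch using three invented 3-dimensional vectors; real embeddings have hundreds of dimensions, and these numbers are assumptions chosen only to illustrate the calculation.

```python
import math

# Hypothetical 3-d embeddings, invented for illustration
embeddings = {
    "dog": [0.9, 0.8, 0.1],
    "cat": [0.8, 0.9, 0.2],
    "vehicle": [0.1, 0.2, 0.9],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


dog_cat = cosine_similarity(embeddings["dog"], embeddings["cat"])
dog_vehicle = cosine_similarity(embeddings["dog"], embeddings["vehicle"])
print(f"dog vs cat:     {dog_cat:.3f}")
print(f"dog vs vehicle: {dog_vehicle:.3f}")
```

The function returns a number for any pair of vectors. It says which points are close and nothing about what a dog is.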

What this reveals: Vector similarity ≠ conceptual understanding.

Question 3: The Encryption Test

“If we encrypted the entire dataset with ROT13, would the model still learn the same geometric relationships?”

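Answer: Yes, provided the tokenizer is rebuilt on the encoded text. ROT13 maps each letter to exactly one other letter, so every co-occurrence count, and therefore the pointwise mutual information (PMI), is preserved perfectly, even though every token is now gibberish.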

What this reveals: The “understanding” is purely statistical—it persists across arbitrary symbol transformations.
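The ROT13 claim can be checked directly on a toy corpus. The sketch below assumes whitespace tokenization, which stands in for a tokenizer rebuilt on the encoded text; the corpus is invented for illustration.

```python
import codecs
from collections import Counter

# Toy corpus, invented for illustration
corpus = [
    "paris is the capital of france",
    "rome is the capital of italy",
    "paris has many museums",
]


def rot13(text):
    """Apply the ROT13 letter substitution (spaces are left unchanged)."""
    return codecs.encode(text, "rot_13")


def bigram_counts(sentences):
    """Count adjacent token pairs using whitespace tokenization."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        counts.update(zip(tokens, tokens[1:]))
    return counts


plain = bigram_counts(corpus)
encoded = bigram_counts([rot13(s) for s in corpus])

# Re-label the plain-text counts with ROT13 and compare with the encoded counts
relabelled = Counter({(rot13(a), rot13(b)): n for (a, b), n in plain.items()})
print(relabelled == encoded)  # True
```

Because the counts match exactly after relabelling, any statistic derived from them, PMI included, is identical in both versions of the corpus.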

The Conclusion

The Map Is Not The Territory

Part I proved: Models don’t discover Platonic forms—they compress corpus statistics.

Part II proves: What gets compressed is discourse, not reality.

Part III (next): Shows how to measure this mechanically and work within the constraints.

The Paradigm Shift

Stop asking: “How do we make LLMs understand truth?”

Start asking: “How do we build systems that acknowledge LLMs are proposition generators and add external grounding?”

The models aren’t broken—our expectations are.

Try the experiments yourself: All code examples are available at [github.com/gsans/platonic-glitch]


Gerard Sans is a London-based AI engineer and Google Developer Expert who’s spent 20 years learning that models are sophisticated mirrors, not magic oracles. Find him at @gerardsans or @nextai_london.