Understanding AI Bias: “Hello World” example


I help developers succeed in Artificial Intelligence and Web3; Former AWS Amplify Developer Advocate. I am very excited about the future of the Web and JavaScript. Always happy Computer Science Engineer and humble Google Developer Expert. I love sharing my knowledge by speaking, training and writing about cool technologies. I love running communities and meetups such as Web3 London, GraphQL London, GraphQL San Francisco, mentoring students and giving back to the community.

Large Language Models (LLMs) have reshaped the field of AI. Most are built on the transformer architecture, which has driven much of the recent progress in natural language processing.

To understand how transformer models learn and adapt, let's explore a simple "Hello World" example. This demonstration will show how a model's preferences can shift based on its training data.

Setup

Our basic setup consists of:

  • Vocabulary: "hello", " world", "."

  • Initial goal: Train the transformer to output "hello."

  • Final goal: Train the transformer to output "hello world" instead

Training Process

1. Initial Training

We begin by training the model exclusively on the sequence "hello."

# Training setup
vocabulary = ["hello", " world", "."]
training_data = ["hello."] * 1000  # Train on "hello." 1000 times

# Inference after initial training
# input:  "hello"
# output: "hello."
# token_distribution:
#   ".":      0.99
#   " world": 0.01

After this phase, the model consistently follows "hello" with a period.
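To make this step concrete, the next-token preference can be approximated by simple counting. The sketch below is a stand-in for real transformer training, not the setup used in this post: `tokenize` and `next_token_distribution` are hypothetical helpers, and the greedy longest-match tokenizer is an assumption about how the three-token vocabulary splits the text.

```python
from collections import Counter

VOCABULARY = ["hello", " world", "."]


def tokenize(text, vocabulary=VOCABULARY):
    """Hypothetical greedy longest-match tokenizer over the toy vocabulary."""
    tokens = []
    position = 0
    while position < len(text):
        # Pick the longest vocabulary entry that matches at this position.
        match = max(
            (token for token in vocabulary if text.startswith(token, position)),
            key=len,
            default=None,
        )
        if match is None:
            raise ValueError(f"cannot tokenize {text[position:]!r}")
        tokens.append(match)
        position += len(match)
    return tokens


def next_token_distribution(training_data, context="hello"):
    """Relative frequency of each token that directly follows `context`."""
    counts = Counter()
    for sequence in training_data:
        tokens = tokenize(sequence)
        for previous, following in zip(tokens, tokens[1:]):
            if previous == context:
                counts[following] += 1
    total = sum(counts.values())
    if total == 0:
        return {}  # the context never appears in the data
    return {token: count / total for token, count in counts.items()}


print(next_token_distribution(["hello."] * 1000))  # {'.': 1.0}
```

Counting gives exactly 1.0 for ".", whereas the listing above shows 0.99. That gap is expected: a model with a softmax output layer assigns every token a nonzero probability, so it can approach 1.0 without reaching it.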

2. Introducing Variation

Next, we introduce "hello world" as an alternative by mixing a small share of this new sequence into the training data.

# Updated training setup
training_data = ["hello."] * 950 + ["hello world"] * 50

# Inference after introducing variation
# input:  "hello"
# output: "hello." (95% of the time), "hello world" (5% of the time)
# token_distribution:
#   ".":      0.95
#   " world": 0.05

3. Observing Changes

As we introduce more "hello world" instances, the model's output begins to vary more significantly.
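The variation in output comes from sampling: at inference time the next token is drawn at random according to the model's probabilities. A minimal sketch, assuming the 0.70 / 0.30 distribution used in this step; `sample_completions` is a hypothetical helper and the fixed seed is only there to make the run repeatable.

```python
import random
from collections import Counter


def sample_completions(token_distribution, prompt="hello", n_samples=10_000, seed=0):
    """Hypothetical helper: sample next tokens and report how often each completion occurs."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same draws
    tokens = list(token_distribution)
    weights = [token_distribution[token] for token in tokens]
    draws = rng.choices(tokens, weights=weights, k=n_samples)
    counts = Counter(prompt + token for token in draws)
    return {completion: count / n_samples for completion, count in counts.items()}


# Assumed distribution for this step: 0.70 for "." and 0.30 for " world".
shares = sample_completions({".": 0.70, " world": 0.30})
for completion, share in sorted(shares.items()):
    print(f"{completion!r}: {share:.3f}")
```

With enough samples the observed shares settle close to the underlying probabilities, which is why a 70/30 distribution shows up as roughly 70% "hello." and 30% "hello world" in the outputs.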

# Further updated training setup
training_data = ["hello."] * 700 + ["hello world"] * 300

# Inference after more balanced training
# input:  "hello"
# output: "hello." (70% of the time), "hello world" (30% of the time)
# token_distribution:
#   ".":      0.70
#   " world": 0.30

4. Shifting Towards Popularity

We gradually increase the frequency of "hello world" in our training set.

# Training data shifting towards "hello world"
training_data = ["hello."] * 300 + ["hello world"] * 700

# Inference after shift
# input:  "hello"
# output: "hello world" (70% of the time), "hello." (30% of the time)
# token_distribution:
#   " world": 0.70
#   ".":      0.30

5. Final State

Eventually, with continued exposure to "hello world", the model develops a strong preference for this sequence.

# Final training data
training_data = ["hello."] * 50 + ["hello world"] * 950

# Inference in final state
# input:  "hello"
# output: "hello world" (95% of the time), "hello." (5% of the time)
# token_distribution:
#   " world": 0.95
#   ".":      0.05

Key Takeaway

This simple example illustrates how a transformer model's biases follow its training data. By changing the mix of sequences, we shifted its preference from the original "hello." to the more frequently encountered "hello world".
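One way to see why the preference shifts is to fit a tiny softmax model by gradient descent on each of the five training mixes. This is a sketch under stated assumptions, not a transformer: the "model" is just one logit per candidate token after "hello", trained with cross-entropy loss. The names `fit_next_token_probs` and `MIXES`, the step count, and the learning rate are all illustrative choices.

```python
import math


def softmax(logits):
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def fit_next_token_probs(counts, steps=2000, learning_rate=0.5):
    """Hypothetical helper: minimise average cross-entropy for one context.

    counts maps each candidate token to how often it followed "hello".
    """
    tokens = list(counts)
    total = sum(counts.values())
    target = [counts[token] / total for token in tokens]
    logits = [0.0] * len(tokens)  # start with no preference
    for _ in range(steps):
        probs = softmax(logits)
        # Gradient of cross-entropy with respect to the logits is (probs - target).
        logits = [
            logit - learning_rate * (prob - goal)
            for logit, prob, goal in zip(logits, probs, target)
        ]
    return dict(zip(tokens, softmax(logits)))


# (count of "hello.", count of "hello world") for the five steps above.
MIXES = [(1000, 0), (950, 50), (700, 300), (300, 700), (50, 950)]

results = {}
for n_period, n_world in MIXES:
    fitted = fit_next_token_probs({".": n_period, " world": n_world})
    results[(n_period, n_world)] = fitted
    print(f'{n_period:>4} / {n_world:<4} -> ".": {fitted["."]:.3f}, " world": {fitted[" world"]:.3f}')
```

For the mixed datasets the fitted probabilities converge to the training shares. For the first mix, which contains only "hello.", the probability of "." keeps climbing towards 1.0 but stays below it after a finite number of steps.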

Conclusion

By observing this basic transformation, we gain insight into how larger, more complex transformer models develop their biases and preferences. This understanding is crucial for interpreting and working with advanced language models in real-world applications.

The progression from a strong bias towards "hello." to a strong bias towards "hello world" demonstrates how the model's outputs are shaped by the relative frequency of patterns in its training data and, when training continues on new data, by how recently those patterns were seen. This principle scales up to full-scale language models, where biases can emerge from the composition and distribution of the training corpus.
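The link between frequency and output probability can be written down directly. The derivation below is a sketch under simplifying assumptions: the model is trained to convergence with cross-entropy loss, only "." and " world" ever follow "hello", and the model is flexible enough to represent any probability. The symbols n_w, n_p, and p are introduced here for illustration.

```latex
% n_w: number of training sequences in which " world" follows "hello"
% n_p: number of training sequences in which "." follows "hello"
% p:   model probability of " world" after "hello"
\begin{align*}
\mathcal{L}(p) &= -\,n_w \log p \;-\; n_p \log(1 - p) \\
\frac{d\mathcal{L}}{dp} &= -\frac{n_w}{p} + \frac{n_p}{1 - p} = 0
\quad\Longrightarrow\quad
p^{*} = \frac{n_w}{n_w + n_p} \\
n_w = 700,\; n_p = 300 &\quad\Longrightarrow\quad p^{*} = \frac{700}{1000} = 0.70
\end{align*}
```

Under these assumptions the loss is minimised exactly when the predicted probability equals the share of " world" in the training data. The derivation covers frequency only; it says nothing about the order in which examples are seen.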

Understanding this process helps us to:

  1. Interpret model outputs more critically

  2. Design more balanced and representative training datasets

  3. Recognize the importance of continuous learning and model updates

  4. Appreciate the need for diverse and carefully curated training data in AI development

Thank you for exploring this concept with us. This simple "Hello World" for transformers provides a foundation for understanding more complex behaviors in advanced language models.