Generation
Weighted randomness
This lesson involves rolling dice to sample from weighted probability distributions. If your students need extra support with this concept, consider running the Weighted Randomness lesson first.
Use a pre-trained (hand-built) bigram model to generate new text through weighted random sampling.

You will need
- your completed bigram model from Training
- a d10 (or similar) for weighted sampling
- pen and paper for jotting down the generated text
Your goal
Generate new text from your bigram language model. Stretch goal: keep going and write a whole story.
Key idea
A language model proposes several possible next words along with how likely each is. Dice rolls pick among those options, and repeating the process word by word builds up new text.
Algorithm
1. Choose a starting word from the first column of your grid.
2. Look at that word’s row to find all possible next words and their counts.
3. Roll dice weighted by the counts (see Weighted Randomness).
4. Write down the chosen word and make it your new starting word.
5. Repeat from step 2 until you hit a natural stopping point (e.g. a full stop “.”) or reach your desired length.
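If you would like to see the same loop in code, here is a minimal Python sketch (illustrative only, not part of the paper activity; the function name and data structure are our own choices). The bigram grid becomes a dictionary mapping each word to the counts of its possible next words, and `random.choices` plays the part of the weighted dice roll.

```python
import random

def generate(bigram_counts, start_word, max_words=20):
    """Sample new text from a grid of bigram counts, one word at a time."""
    words = [start_word]
    current = start_word
    for _ in range(max_words - 1):
        next_counts = bigram_counts.get(current)
        if not next_counts:
            break  # no recorded continuation for this word, so stop
        options = list(next_counts.keys())
        weights = list(next_counts.values())
        # weighted random choice: the code equivalent of the dice roll
        current = random.choices(options, weights=weights, k=1)[0]
        words.append(current)
        if current == ".":
            break  # a full stop is a natural stopping point
    return " ".join(words)
```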
Example
Before you try generating text yourself, work through this example to see the algorithm in action.
Using the same bigram model from the example in Training:
|  | see | spot | run | . | jump | , |
|---|---|---|---|---|---|---|
| see |  | 2 |  |  |  |  |
| spot |  |  | 1 |  | 1 | 2 |
| run |  |  |  | 2 |  | 1 |
| . | 1 |  | 1 |  | 1 |  |
| jump |  |  |  | 2 |  | 1 |
| , |  | 2 | 1 |  | 1 |  |
- choose (for example) “see” as your starting word
- “see” (row) → “spot” (column); it’s the only option, so write down “spot” as the next word
- “spot” → “run” (25%), “jump” (25%) or “,” (50%); roll dice to choose. Let’s say the dice picks “run”; write it down
- “run” → “.” (67%) or “,” (33%); roll dice to choose. Let’s say the dice picks “.”; write it down
- “.” → “see” (33%), “run” (33%) or “jump” (33%); roll dice to choose. Let’s say the dice picks “see”; write it down
- “see” → “spot”; it’s the only option, so write down “spot”
- … and so on
After the above steps, the generated text is “see spot run. see spot”
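If you want to check your own dice rolls against a computer, the example grid above can be written out for the `generate` sketch from the Algorithm section (the dictionary name is our own, and the counts mirror the grid; punctuation is treated as a word, so the printed output has spaces around it).

```python
example_model = {
    "see":  {"spot": 2},
    "spot": {"run": 1, "jump": 1, ",": 2},
    "run":  {".": 2, ",": 1},
    ".":    {"see": 1, "run": 1, "jump": 1},
    "jump": {".": 2, ",": 1},
    ",":    {"spot": 2, "run": 1, "jump": 1},
}

# Each run can differ; one possible output: "see spot run ."
print(generate(example_model, "see"))
```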
Instructor notes
Discussion questions
- how does the starting word affect your generated text?
- why does the text sometimes get stuck in loops?
- if this is a bigram (i.e. 2-gram) model, how would a unigram (1-gram) model work?
- how could you make generation less repetitive?
- does the generated text capture the style of your training text?
Connection to current LLMs
This generation process is essentially the same one current LLMs use to produce text:
- sequential generation: both generate one word (or token) at a time
- probabilistic sampling: both use weighted random selection (your dice rolls play the same role as the model’s token sampler)
- probability distribution: neural network outputs probabilities for all 50,000+ possible next tokens
- no planning: neither looks ahead—just picks the next word
- variability: same prompt can produce different outputs due to randomness
The key point: sophisticated AI responses emerge from this simple process repeated thousands of times. Your paper model demonstrates that language generation is fundamentally about sampling from learned probability distributions. The randomness is why LLMs give different responses to the same prompt and why language models can be creative rather than repetitive. These physical sampling methods demonstrate the same mathematical operation that happens inside modern language models every time they pick the next token.
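To make the parallel concrete, here is a small illustrative sketch (the vocabulary and scores are made up, and a real model scores tens of thousands of tokens): the network produces a score for every possible next token, softmax turns those scores into probabilities, and one token is drawn at random, just like a weighted dice roll.

```python
import math
import random

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores over a toy vocabulary; a real LLM scores 50,000+ tokens.
vocab = ["see", "spot", "run", ".", "jump", ","]
scores = [0.2, 2.1, 1.3, 0.5, 1.0, 0.1]

probabilities = softmax(scores)
next_token = random.choices(vocab, weights=probabilities, k=1)[0]
print(next_token)  # rerunning can give a different token: the "variability" noted above
```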
Note: in AI/ML more broadly, this process of using a trained model to produce outputs is commonly called “inference”—you may encounter this term in other contexts. In these teaching resources we use “generation” specifically because it more clearly describes what language models do: they generate text.
Interactive widget
Step through the generation process at your own pace. Click on a row to select a starting word, then press Play or Step to watch the dice roll and text being generated. You can also edit the training text to create your own model.