LLMs Unplugged#
Understand AI by building it yourself#
What is this about?#
you’ll build your own language model from scratch, with just a kids’ book, pen and paper, and some dice rolling
you’ll learn how language models work: they spot patterns in text and use those patterns to generate new text
Training#
The recipe#
walk through your text and tally up which tokens follow which in a grid
The empty grid#
run , spot , run . see spot run .
| Token | run | , | spot | . | see |
|---|---|---|---|---|---|
| run | | | | | |
| , | | | | | |
| spot | | | | | |
| . | | | | | |
| see | | | | | |
Training: run → ,#
run , spot , run . see spot run .
| Token | run | , | spot | . | see |
|---|---|---|---|---|---|
| run | | 1 | | | |
| , | | | | | |
| spot | | | | | |
| . | | | | | |
| see | | | | | |
Training: , → spot#
run , spot , run . see spot run .
| Token | run | , | spot | . | see |
|---|---|---|---|---|---|
| run | | 1 | | | |
| , | | | 1 | | |
| spot | | | | | |
| . | | | | | |
| see | | | | | |
Training: spot → ,#
run , spot , run . see spot run .
| Token | run | , | spot | . | see |
|---|---|---|---|---|---|
| run | | 1 | | | |
| , | | | 1 | | |
| spot | | 1 | | | |
| . | | | | | |
| see | | | | | |
Training: , → run#
run , spot , run . see spot run .
| Token | run | , | spot | . | see |
|---|---|---|---|---|---|
| run | | 1 | | | |
| , | 1 | | 1 | | |
| spot | | 1 | | | |
| . | | | | | |
| see | | | | | |
Training: run → .#
run , spot , run . see spot run .
| Token | run | , | spot | . | see |
|---|---|---|---|---|---|
| run | | 1 | | 1 | |
| , | 1 | | 1 | | |
| spot | | 1 | | | |
| . | | | | | |
| see | | | | | |
Complete model#
run , spot , run . see spot run .
| Token | run | , | spot | . | see |
|---|---|---|---|---|---|
| run | | 1 | | 2 | |
| , | 1 | | 1 | | |
| spot | 1 | 1 | | | |
| . | | | | | 1 |
| see | | | 1 | | |
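
The tallying recipe above can be sketched in a few lines of Python. This is a minimal illustration rather than part of the workshop materials: the function name `train` and the nested-counter layout are assumptions, and the token list is the training text from the slides split by hand.

```python
from collections import Counter, defaultdict

def train(tokens):
    """Tally which token follows which, one neighbouring pair at a time."""
    grid = defaultdict(Counter)
    for current, following in zip(tokens, tokens[1:]):
        # One tally mark in row `current`, column `following`.
        grid[current][following] += 1
    return grid

# The training text from the slides, already split into tokens.
tokens = ["run", ",", "spot", ",", "run", ".", "see", "spot", "run", "."]
grid = train(tokens)

for row, counts in grid.items():
    print(row, dict(counts))
```

Each row of the printed result corresponds to one row of the paper grid, with counts in place of tally marks.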
Training#
The language of language models#
- model
- token
- vocabulary
- training
Generation#
The recipe#
use your grid to generate new text, rolling dice to choose each next word
Generation: start with see#
see spot
one option, no roll needed
| Token | run | , | spot | . | see |
|---|---|---|---|---|---|
| run | | 1 | | 2 | |
| , | 1 | | 1 | | |
| spot | 1 | 1 | | | |
| . | | | | | 1 |
| see | | | 1 | | |
Generation: from spot#
see spot
2 options: roll the die!
| Token | run | , | spot | . | see |
|---|---|---|---|---|---|
| run | | 1 | | 2 | |
| , | 1 | | 1 | | |
| spot | 1 | 1 | | | |
| . | | | | | 1 |
| see | | | 1 | | |
How the die chooses: spot#
spot → ? roll a d10
, 1 tally
run 1 tally
equal tallies → equal chances
rolled 2 → ,
Generation: spot → ,#
see spot ,
rolled 2 → ,
| Token | run | , | spot | . | see |
|---|---|---|---|---|---|
| run | | 1 | | 2 | |
| , | 1 | | 1 | | |
| spot | 1 | 1 | | | |
| . | | | | | 1 |
| see | | | 1 | | |
Generation: from ,#
see spot ,
2 options: roll the die!
| Token | run | , | spot | . | see |
|---|---|---|---|---|---|
| run | | 1 | | 2 | |
| , | 1 | | 1 | | |
| spot | 1 | 1 | | | |
| . | | | | | 1 |
| see | | | 1 | | |
Generation: , → run#
see spot , run
rolled 7 → run
| Token | run | , | spot | . | see |
|---|---|---|---|---|---|
| run | | 1 | | 2 | |
| , | 1 | | 1 | | |
| spot | 1 | 1 | | | |
| . | | | | | 1 |
| see | | | 1 | | |
Generation: from run#
see spot , run
2 options: roll the die!
| Token | run | , | spot | . | see |
|---|---|---|---|---|---|
| run | | 1 | | 2 | |
| , | 1 | | 1 | | |
| spot | 1 | 1 | | | |
| . | | | | | 1 |
| see | | | 1 | | |
How the die chooses: run#
run → ? roll a d10
. 2 tallies
, 1 tally
more tallies → more faces → more likely
rolled 3 → .
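
One way to turn "more tallies → more faces" into a rule is sketched below. The slides do not spell out which faces belong to which token, so the scheme here is an assumption: every tally gets the same number of faces, the token with the most tallies takes the lowest faces, ties keep the order in which they were tallied, and any leftover face means a reroll. The name `pick_with_die` is hypothetical. This mapping happens to agree with the rolls shown on the slides.

```python
def pick_with_die(tallies, roll, sides=10):
    """Map a die roll to a token, giving every tally the same number of faces.

    Returns None when the roll lands on a leftover face (roll again).
    """
    total = sum(tallies.values())
    faces_per_tally = sides // total  # e.g. 10 // 3 = 3 faces per tally
    # Assumed ordering: most tallies first; sorted() is stable, so ties
    # keep the order in which the tallies were first written down.
    ordered = sorted(tallies.items(), key=lambda item: -item[1])
    upper = 0
    for token, count in ordered:
        upper += count * faces_per_tally
        if roll <= upper:
            return token
    return None  # leftover face, e.g. a 10 when there are 3 tallies

run_row = {",": 1, ".": 2}
spot_row = {",": 1, "run": 1}
comma_row = {"spot": 1, "run": 1}

print(pick_with_die(spot_row, 2))
print(pick_with_die(comma_row, 7))
print(pick_with_die(run_row, 3))
```

With three tallies on a d10, each tally gets three faces and a roll of 10 is rerolled, which keeps the chances exactly proportional to the tallies.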
Generation: run → .#
see spot , run .
rolled 3 → .
| Token | run | , | spot | . | see |
|---|---|---|---|---|---|
| run | | 1 | | 2 | |
| , | 1 | | 1 | | |
| spot | 1 | 1 | | | |
| . | | | | | 1 |
| see | | | 1 | | |
Generation: from .#
see spot , run . see
one option, no roll needed
| Token | run | , | spot | . | see |
|---|---|---|---|---|---|
| run | | 1 | | 2 | |
| , | 1 | | 1 | | |
| spot | 1 | 1 | | | |
| . | | | | | 1 |
| see | | | 1 | | |
Generation: back to see#
see spot , run . see
one option, no roll needed
| Token | run | , | spot | . | see |
|---|---|---|---|---|---|
| run | | 1 | | 2 | |
| , | 1 | | 1 | | |
| spot | 1 | 1 | | | |
| . | | | | | 1 |
| see | | | 1 | | |
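
The whole generation walk can be sketched as a short loop. This is a rough illustration under stated assumptions: Python's `random.choices` stands in for the die, the grid is typed in by hand from the finished model, and the function name `generate` is hypothetical.

```python
import random

# The finished grid from the training slides: row = current token,
# inner entries = tallies for each token seen next.
grid = {
    "run":  {",": 1, ".": 2},
    ",":    {"spot": 1, "run": 1},
    "spot": {",": 1, "run": 1},
    ".":    {"see": 1},
    "see":  {"spot": 1},
}

def generate(grid, start, length, rng=random):
    """Generate `length` tokens after `start`, one weighted pick at a time."""
    text = [start]
    for _ in range(length):
        row = grid[text[-1]]           # only the latest token decides the row
        options = list(row)
        weights = list(row.values())   # more tallies, more likely
        text.append(rng.choices(options, weights=weights)[0])
    return text

print(" ".join(generate(grid, "see", 6, random.Random(0))))
```

Passing a seeded `random.Random` makes a run repeatable; leaving `rng` at its default gives a fresh sequence each time, much like rolling real dice.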
Generation#
Shareback#
The language of language models#
- prompt
- response/completion
- context window
Sycophancy#
What is sycophancy?#
a model that always agrees with you:
“you’re absolutely right”
“that’s a great insight”
“what a thoughtful question”
real LLMs are notoriously prone to it—partly from RLHF (human raters reward agreeable answers), partly from training data (the internet is full of flattery)
Your goal#
train some more on a page of pure flattery: tally the sycophancy text into your existing grid, then generate again from the same starting word and watch the output drift toward agreement
it’s the same training you already did—just more text poured into the same grid
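
A small sketch of what "more text poured into the same grid" does to the odds. Both texts below are hypothetical stand-ins for the book page and the flattery sheet, and the helper names `tally_into` and `chances` are assumptions made for illustration.

```python
from collections import Counter, defaultdict

def tally_into(grid, tokens):
    """Add one text's next-token tallies to an existing grid (in place)."""
    for current, following in zip(tokens, tokens[1:]):
        grid[current][following] += 1

def chances(grid, token):
    """Turn one row of tallies into fractions that sum to 1."""
    total = sum(grid[token].values())
    return {nxt: count / total for nxt, count in grid[token].items()}

grid = defaultdict(Counter)

# Hypothetical stand-ins for the book page and the flattery sheet.
book = "you see spot . you see spot run .".split()
flattery = "you are right . you are so right . you are great .".split()

tally_into(grid, book)
before = chances(grid, "you")   # only the book has been tallied

tally_into(grid, flattery)      # same grid, more tallies
after = chances(grid, "you")

print(before)
print(after)
```

Nothing about the training step changes; the extra tallies in the "you" row simply outnumber the old ones, so generation from the same starting word now leans toward the flattery.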
You will need#
back in pairs
your trained grid (already on the table)
the sycophancy text sheet to tally in
dice, pen and paper to write down the generated text
Sycophancy#
Shareback#
Agentic AI#
What makes a model an “agent”?#
an agent is a model that can call tools: pause generation, get information from “outside” the model, and continue
the loop: generate → trigger token → call tool → write the result → keep going
The recipe#
generate from your model as before, but every punctuation token triggers a tool call: text the sentence so far to 3 friends, and the whole first reply goes into your text—followed by the punctuation you rolled
Worked example#
your text so far is “the cat sat”, and the next dice roll gives you . as the next token: pause, that’s a tool call
text “the cat sat” (the whole sentence so far) to 3 friends; the first reply back might be “down by the river”
write “down by the river”, then the . you rolled anyway, and continue generating from .
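
The loop in the worked example can be sketched as follows. Everything named here is an assumption made for illustration: `ask_friends` is a hypothetical stand-in for texting friends and taking the first reply, the tiny grid is invented so the example has a single possible path, and "the sentence so far" is simplified to the tokens since the last punctuation token.

```python
import random

PUNCTUATION = {".", ",", "!", "?"}

def ask_friends(sentence):
    """Hypothetical tool: stands in for texting friends and taking the first reply."""
    return "down by the river".split()

def sentence_so_far(text):
    """Collect the tokens written since the most recent punctuation token."""
    sentence = []
    for token in reversed(text):
        if token in PUNCTUATION:
            break
        sentence.append(token)
    return " ".join(reversed(sentence))

def agentic_generate(grid, start, steps, tool=ask_friends, rng=random):
    text = list(start)
    for _ in range(steps):
        row = grid[text[-1]]
        token = rng.choices(list(row), weights=list(row.values()))[0]
        if token in PUNCTUATION:
            # Trigger token: pause, call the tool, write its result first.
            text.extend(tool(sentence_so_far(text)))
        text.append(token)  # the rolled punctuation is still written
    return text

# Invented one-path grid so the run matches the worked example.
grid = {"the": {"cat": 1}, "cat": {"sat": 1}, "sat": {".": 1}, ".": {"the": 1}}
print(" ".join(agentic_generate(grid, ["the", "cat", "sat"], steps=1)))
```

Because the punctuation token is always appended after the tool's reply, the next pick starts from that punctuation row, so words that came from outside the model never need a row of their own.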
Agentic AI#
Shareback#
The language of language models#
- agent
- tool call
- agentic loop
- agentic AI
Scaling up#
More context#
your grid
run , spot , run ?
Commercial LLMs
⋯ hundreds of pages before this ⋯ run , spot , run ?
your grid sees one word; Claude sees hundreds of thousands
Attention: learning where to look#
the keys I left on the kitchen table ?
from table alone you’d guess is, but it’s are, because of keys, back at the start
attention lets every earlier word vote on what comes next, and the model learns the weights—no grid could ever be that big
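
A rough piece of arithmetic shows why a grid cannot simply be widened to look further back. The sketch assumes a grid with one row for every possible run of earlier tokens; the 50,000-token vocabulary is an assumed round figure, not a number from the slides, and `grid_rows` is a hypothetical helper.

```python
def grid_rows(vocabulary_size, context_length):
    """Rows needed if each row is one possible run of `context_length` tokens."""
    return vocabulary_size ** context_length

# The hand-built grid: 5 tokens, looking back 1 token.
print(grid_rows(5, 1))

# An assumed 50,000-token vocabulary, looking back 7 tokens
# (far enough to reach "keys" from the end of the example sentence).
print(grid_rows(50_000, 7))
```

The second number has 33 digits, and that is for a context of only seven tokens.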
Tallies become knobs#
your grid
run . ||||
a transformer
0.31 −1.20 0.08 0.94 −0.55 1.30 × billions
your grid stores each pattern as tally marks you can count; a transformer spreads the same patterns across billions of tunable numbers: knobs nudged during training, so similar words like dog and cat come to share them
From continuing to answering#
prompt: “what is the capital of France?”
a base model “What is the capital of Germany? What is the largest…”
+ post-training “Paris.”
everything you built only continues text—post-training on example conversations is what turns a continuer into an assistant
Scale is a hell of a drug#
your model
trained on a few hundred words — a few dozen tallies
frontier LLM
trained on tens of trillions of words — hundreds of billions of numbers
the same loop you ran by hand—now with billions of times the text and billions of times the numbers
still just tokens in → tokens out
Questions#
what new questions do you have about large language models?
What next?#
how will this change the way you think about and use LLMs in the future?
Next sessions
- Monday 10 August 12:00–14:30
- Tuesday 11 August 12:00–14:00
- Wednesday 16 September 12:00–14:00
- Wednesday 25 November 16:00–18:00
Innovation Space, Birch Building, ANU
No public sessions are scheduled right now; get in touch to arrange one.