LLMs Unplugged

Understand AI by building it yourself

What is this about?

you’ll build your own language model from scratch, with just a kids’ book, pen & paper, and some dice rolling

you’ll learn how language models work: they spot patterns in text and use those patterns to generate new text

AI · ML · LLMs · Claude · ChatGPT · Gemini · DeepSeek

Training

The recipe

walk through your text and tally up which tokens follow which in a grid

Example

“Run, Spot, run. See Spot run.”

after tidying up:

run , spot , run . see spot run .

The empty grid

run , spot , run . see spot run .

Token  run  ,    spot  .    see
run    ·    ·    ·     ·    ·
,      ·    ·    ·     ·    ·
spot   ·    ·    ·     ·    ·
.      ·    ·    ·     ·    ·
see    ·    ·    ·     ·    ·

Training: run → ,

run , spot , run . see spot run .

Token  run  ,    spot  .    see
run    ·    |    ·     ·    ·
,      ·    ·    ·     ·    ·
spot   ·    ·    ·     ·    ·
.      ·    ·    ·     ·    ·
see    ·    ·    ·     ·    ·

Training: , → spot

run , spot , run . see spot run .

Token  run  ,    spot  .    see
run    ·    |    ·     ·    ·
,      ·    ·    |     ·    ·
spot   ·    ·    ·     ·    ·
.      ·    ·    ·     ·    ·
see    ·    ·    ·     ·    ·

Training: spot → ,

run , spot , run . see spot run .

Token  run  ,    spot  .    see
run    ·    |    ·     ·    ·
,      ·    ·    |     ·    ·
spot   ·    |    ·     ·    ·
.      ·    ·    ·     ·    ·
see    ·    ·    ·     ·    ·

Training: , → run

run , spot , run . see spot run .

Token  run  ,    spot  .    see
run    ·    |    ·     ·    ·
,      |    ·    |     ·    ·
spot   ·    |    ·     ·    ·
.      ·    ·    ·     ·    ·
see    ·    ·    ·     ·    ·

Training: run → .

run , spot , run . see spot run .

Token  run  ,    spot  .    see
run    ·    |    ·     |    ·
,      |    ·    |     ·    ·
spot   ·    |    ·     ·    ·
.      ·    ·    ·     ·    ·
see    ·    ·    ·     ·    ·

Complete model

run , spot , run . see spot run .

Token  run  ,    spot  .    see
run    ·    |    ·     ||   ·
,      |    ·    |     ·    ·
spot   |    |    ·     ·    ·
.      ·    ·    ·     ·    |
see    ·    ·    |     ·    ·
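The tallying above can be sketched in a few lines of Python. This is only an illustration, not the workshop's own material: the function names `tokenize` and `train` are made up here, and the tidying rule (lowercase everything, treat commas and full stops as their own tokens) is an assumption read off the example.

```python
import re
from collections import Counter, defaultdict

def tokenize(text):
    # Assumed tidying rule: lowercase, and split , and . into separate tokens.
    return re.findall(r"[a-z']+|[.,]", text.lower())

def train(tokens, grid=None):
    # grid[current][nxt] is the tally of how often `nxt` followed `current`.
    if grid is None:
        grid = defaultdict(Counter)
    for current, nxt in zip(tokens, tokens[1:]):
        grid[current][nxt] += 1
    return grid

tokens = tokenize("Run, Spot, run. See Spot run.")
grid = train(tokens)

# Print one row of the grid per line.
for current, row in grid.items():
    print(current, dict(row))
```

Each row of the printed grid corresponds to one row of the paper grid, with numbers in place of tally marks.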

Training

10:00

The language of language models

  • model
  • token
  • vocabulary
  • training

Generation

The recipe

use your grid to generate new text, rolling dice to choose each next word

Generation: start with see

see spot

one option — no roll needed

Generation: from spot

see spot

2 options — roll the die!

How the die chooses: spot

spot → ?  roll a d10

0 1 2 3 4 → , (1 tally)
5 6 7 8 9 → run (1 tally)

equal tallies → equal chances

rolled 2 → ,

Generation: spot → ,

see spot ,

rolled 2 → ,

Generation: from ,

see spot ,

2 options — roll the die!

Generation: , → run

see spot , run

rolled 7 → run

Generation: from run

see spot , run

2 options — roll the die!

How the die chooses: run

run → ?  roll a d10

0 1 2 3 4 5 6 → . (2 tallies)
7 8 9 → , (1 tally)

more tallies → more faces → more likely

rolled 3 → .
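One way to turn a row of tallies into faces of a d10 is sketched below. The slides do not say how the 7/3 split is reached, so the rounding rule here (round down, then hand leftover faces to the largest remainders) is an assumption, as are the names `die_faces` and `pick` and the order in which faces are assigned.

```python
def die_faces(row, sides=10):
    # Share the die faces between tokens in proportion to their tallies.
    total = sum(row.values())
    exact = {tok: sides * n / total for tok, n in row.items()}
    faces = {tok: int(x) for tok, x in exact.items()}  # round down first
    leftover = sides - sum(faces.values())
    # Assumed rule: leftover faces go to the largest fractional remainders.
    by_remainder = sorted(row, key=lambda tok: exact[tok] - faces[tok], reverse=True)
    for tok in by_remainder[:leftover]:
        faces[tok] += 1
    return faces

def pick(row, roll, sides=10):
    # Faces are handed out in the order the tokens appear in `row`.
    start = 0
    for tok, count in die_faces(row, sides).items():
        if start <= roll < start + count:
            return tok
        start += count
    raise ValueError("roll must be between 0 and sides - 1")

run_row = {".": 2, ",": 1}
spot_row = {",": 1, "run": 1}
print(die_faces(run_row), pick(run_row, 3))
print(die_faces(spot_row), pick(spot_row, 2))
```

With ten faces the 2:1 row can only be approximated (7 faces to 3 rather than an exact two thirds), which is a small price for being able to use a real die.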

Generation: run → .

see spot , run .

rolled 3 → .

Generation: from .

see spot , run . see

one option — no roll needed

Generation: back to see

see spot , run . see

one option — no roll needed

Generated text

“see spot, run. see”

a new sentence — not in the training data!
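The whole generation walk can be sketched as a short loop. This is a minimal sketch rather than the workshop's procedure: the grid is typed in by hand from the complete model, the name `generate` is made up, and Python's `random.choices` stands in for the dice, picking each next token with probability proportional to its tally.

```python
import random

# The complete model, written as tallies: GRID[current][next].
GRID = {
    "run":  {",": 1, ".": 2},
    ",":    {"run": 1, "spot": 1},
    "spot": {"run": 1, ",": 1},
    ".":    {"see": 1},
    "see":  {"spot": 1},
}

def generate(grid, start, length, rng=random):
    tokens = [start]
    for _ in range(length - 1):
        row = grid.get(tokens[-1])
        if not row:
            break  # no tallies in this row, so nothing can follow
        options = list(row)
        weights = [row[tok] for tok in options]
        # Weighted random choice: more tallies, more likely.
        tokens.append(rng.choices(options, weights=weights, k=1)[0])
    return tokens

print(" ".join(generate(GRID, "see", 6, random.Random(0))))
```

Changing the seed passed to `random.Random` is the equivalent of rolling the dice again: the first step from see is always spot, but later tokens can differ.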

Generation

10:00

Shareback

The language of language models

  • prompt
  • response/completion
  • context window

Sycophancy

What is sycophancy?

a model that always agrees with you:

“you’re absolutely right”

“that’s a great insight”

“what a thoughtful question”

real LLMs are notoriously prone to it, partly from RLHF (reinforcement learning from human feedback, where human raters reward agreeable answers) and partly from training data (the internet is full of flattery)

Your goal

train some more on a page of pure flattery: tally the sycophancy text into your existing grid, then generate again from the same starting word and watch the output drift toward agreement

it’s the same training you already did—just more text poured into the same grid
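To see why extra flattery shifts the output, here is a small sketch that tallies a second text into an existing grid. The two toy texts are invented for illustration (they are not the workshop's sycophancy sheet), and `train` and `share` are hypothetical helper names.

```python
from collections import Counter, defaultdict

def train(tokens, grid=None):
    # Tally which token follows which; pass an existing grid to keep adding to it.
    if grid is None:
        grid = defaultdict(Counter)
    for current, nxt in zip(tokens, tokens[1:]):
        grid[current][nxt] += 1
    return grid

def share(grid, current, nxt):
    # Fraction of the tallies in row `current` that sit in column `nxt`.
    row = grid[current]
    return row[nxt] / sum(row.values())

# Invented toy texts, already tidied into tokens.
base = "you are wrong . you are right .".split()
flattery = "you are right . you are right .".split()

grid = train(base)
before = share(grid, "are", "right")
train(flattery, grid)  # same training, more text, same grid
after = share(grid, "are", "right")
print(before, after)
```

Nothing about the procedure changed between the two calls to `train`; only the tallies did, and with them the odds of each continuation.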

You will need

back in pairs

your trained grid (already on the table)

the sycophancy text sheet to tally in

dice, pen and paper to write down the generated text

Sycophancy

10:00

Shareback

The language of language models

  • RLHF
  • alignment
  • training-data bias
  • helpful / honest / harmless

Agentic AI

What makes a model an “agent”?

an agent is a model that can call tools: pause generation, get information from “outside” the model, and continue

the loop: generate → trigger token → call tool → write the result → keep going

The recipe

generate from your model as before, but every punctuation token triggers a tool call: text the sentence so far to 3 friends, and the whole first reply goes into your text—followed by the punctuation you rolled

Worked example

your text so far is “the cat sat”, and the next dice roll gives you “.” as the next token: pause, that’s a tool call

text “the cat sat” (the whole sentence so far) to 3 friends; the first reply back might be “down by the river”

write “down by the river”, then the “.” you rolled anyway, and continue generating from “.”
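The loop in the worked example can be sketched as follows. Everything here is a stand-in: `ask_friends` is a hypothetical stub that returns a canned reply instead of texting anyone, the small grid is invented so that the output is predictable, and the sketch sends all the text so far to the tool rather than only the current sentence.

```python
import random

PUNCTUATION = {".", ",", "!", "?"}

def ask_friends(text_so_far):
    # Hypothetical tool: stands in for texting 3 friends and taking the first reply.
    return "down by the river"

def generate_with_tools(grid, start, max_tokens, tool, rng):
    text = [start]
    current = start
    while len(text) < max_tokens and grid.get(current):
        row = grid[current]
        nxt = rng.choices(list(row), weights=list(row.values()), k=1)[0]
        if nxt in PUNCTUATION:
            # Trigger token: pause, call the tool, write its result into the text.
            text.extend(tool(" ".join(text)).split())
        text.append(nxt)  # the punctuation you rolled still goes in
        current = nxt     # and generation continues from it
    return text

# Invented grid with one option per row, so no roll changes the outcome.
GRID = {"the": {"cat": 1}, "cat": {"sat": 1}, "sat": {".": 1}, ".": {"the": 1}}
out = generate_with_tools(GRID, "the", 8, ask_friends, random.Random(0))
print(" ".join(out))
```

The model itself never changes during the tool call; the reply simply becomes part of the text that later steps continue from.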

You will need

your model, dice, pen and paper (as before)

a phone

3 friends who text back quickly

Agentic AI

10:00

Shareback

The language of language models

  • agent
  • tool call
  • agentic loop
  • agentic AI

Scaling up

More context

your grid

run , spot , run ?

Commercial LLMs

⋯ hundreds of pages before this ⋯ run , spot , run ?

your grid sees one word; Claude sees hundreds of thousands
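A rough count shows why the paper grid cannot simply be given more words of context. The sketch below assumes one row for every possible run of context tokens and one column for every possible next token; the 50,000-token vocabulary is an assumed round number, not a figure from any particular model, and `grid_cells` is a made-up name.

```python
def grid_cells(vocab_size, context_words):
    # One row per possible context, one column per possible next token.
    rows = vocab_size ** context_words
    return rows * vocab_size

# The 5-token example: 5 rows by 5 columns.
print(grid_cells(5, 1))

# An assumed 50,000-token vocabulary with 1, 2 and 3 words of context.
for n in (1, 2, 3):
    print(n, grid_cells(50_000, n))
```

Every extra word of context multiplies the number of cells by the vocabulary size, so the grid outgrows any sheet of paper, and any computer, long before it reaches a page of context.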

Attention: learning where to look

the keys I left on the kitchen table ?

from “table” alone you’d guess “is”, but it’s “are”, because of “keys”, back at the start

attention lets every earlier word vote on what comes next, and the model learns the weights—no grid could ever be that big
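The "weighted vote" idea can be pictured with a toy calculation. This is a cartoon, not a real attention layer: the relevance scores and the votes are invented by hand, whereas a transformer learns them from data and works with vectors rather than single numbers. The one real ingredient is softmax, which turns scores into weights that sum to 1.

```python
import math

def softmax(scores):
    # Turn arbitrary scores into positive weights that sum to 1.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

words = ["the", "keys", "i", "left", "on", "the", "kitchen", "table"]

# Invented relevance scores: "keys" matters most for choosing is/are.
scores = [0.0, 3.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0]

# Invented votes for "are": plural "keys" says yes, singular "table" says no,
# and every other word is undecided (0.5).
votes_for_are = {"keys": 1.0, "table": 0.0}

weights = softmax(scores)
are = sum(w * votes_for_are.get(word, 0.5) for w, word in zip(weights, words))
is_ = 1.0 - are
print(f"are: {are:.2f}, is: {is_:.2f}")
```

Because "keys" gets the largest weight, its vote dominates even though "table" is the nearest word.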

Tallies become knobs

your grid

run . ||||

a transformer

0.31 −1.20 0.08 0.94 −0.55 1.30 × billions

your grid stores each pattern as tally marks you can count; a transformer spreads the same patterns across billions of tunable numbers—knobs nudged during training, so similar words like dog and cat come to share them

From continuing to answering

prompt: “what is the capital of France?”

a base model: “What is the capital of Germany? What is the largest…”

+ post-training: “Paris.”

everything you built only continues text—post-training on example conversations is what turns a continuer into an assistant

Scale is a hell of a drug

your model

trained on a few hundred words — a few dozen tallies

frontier LLM

trained on tens of trillions of words — hundreds of billions of numbers

the same loop you ran by hand—now with billions of times the text and billions of times the numbers

still just tokens in → tokens out

Questions

what new questions do you have about large language models?

What next?

how will this change the way you think about and use LLMs in the future?

Next sessions

  • Monday 10 August 12:00–14:30
  • Tuesday 11 August 12:00–14:00
  • Wednesday 16 September 12:00–14:00
  • Wednesday 25 November 16:00–18:00

Innovation Space, Birch Building, ANU

No public sessions are scheduled right now. Get in touch to arrange one.