Tools
Generate N-gram language model materials directly in your browser. Upload a text file or paste text, choose your options, and download a PDF, with no server required. Then print it out and use it in any of the LLMs Unplugged activities that require a pre-trained language model.
Example booklets
Don’t want to generate your own? Download ready-to-print booklets:
- Green Eggs and Ham (Dr Seuss, 5 pages)
- The Cat in the Hat (Dr Seuss, 7 pages)
- Beatles Lyrics (35 pages)
- A Christmas Carol (Dickens, 52 pages)
- Frankenstein (Shelley, 101 pages)
- Collected Hemingway (379 pages)
How to use
- Enter your text: upload a file or paste text directly
- Add metadata: provide a title (required) and optionally an author
- Choose options: select the n-gram size and output type
- Generate: preview as SVG or download as PDF
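To make the n-gram size option concrete, here is a minimal Python sketch of how n-gram counts could be collected from a text. It is an illustration only, not the tool's actual code: the `tokenize` and `count_ngrams` helpers and the tokenization rule (lowercased words plus basic punctuation) are assumptions.

```python
from collections import Counter, defaultdict
import re


def tokenize(text):
    """Lowercase the text and split it into word and punctuation tokens.

    Assumed tokenization rule; the real tool may split text differently.
    """
    return re.findall(r"[a-z']+|[.,!?]", text.lower())


def count_ngrams(tokens, n=2):
    """Map each (n-1)-token context to a Counter of the tokens that follow it."""
    if n < 2:
        raise ValueError("n must be at least 2")
    model = defaultdict(Counter)
    for i in range(len(tokens) - n + 1):
        context = tuple(tokens[i:i + n - 1])
        model[context][tokens[i + n - 1]] += 1
    return model


tokens = tokenize("I do not like green eggs and ham. I do not like them.")
bigrams = count_ngrams(tokens, n=2)   # context is one token
trigrams = count_ngrams(tokens, n=3)  # context is two tokens
```

With `n=2` each context is a single token; larger values of `n` use longer contexts, which tends to follow the source text more closely but needs more text to fill the tables.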
Supported file formats
- .txt: plain text files
- .md: markdown files (treated as plain text)
- .docx: Microsoft Word documents (text extracted, formatting ignored)
- .pdf: PDF documents (best-effort text extraction; scanned/image-based PDFs won’t work)
Output types
- Booklet: generates dice lookup tables for probabilistic text generation, as used in the Pre-trained Model Generation lesson
- Cutouts: generates token cards for the Training lesson (bucket method)
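As a rough illustration of the idea behind a dice lookup table, the Python sketch below turns the follower counts for one context into roll ranges, so that a more frequent follower covers more rolls. The layout, the `dice_table` and `lookup` helpers, and the example counts are all assumptions; the generated booklets may organize their tables differently.

```python
from collections import Counter


def dice_table(next_counts):
    """Give each follower token a contiguous range of rolls, one roll per observed count."""
    table = []
    low = 1
    # Most frequent followers first; ties broken alphabetically for a stable layout.
    for token, count in sorted(next_counts.items(), key=lambda kv: (-kv[1], kv[0])):
        high = low + count - 1
        table.append((low, high, token))
        low = high + 1
    return table


def lookup(table, roll):
    """Return the token whose roll range contains the given roll."""
    for low, high, token in table:
        if low <= roll <= high:
            return token
    raise ValueError(f"roll {roll} is outside the table")


# Illustrative counts only, not taken from any real booklet.
followers = Counter({"not": 3, "like": 2, "am": 1})
table = dice_table(followers)
```

Here the counts sum to 6, so a single six-sided die covers the whole table: rolls 1 to 3 pick "not", 4 to 5 pick "like", and 6 picks "am".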
PDF generator
Experimental
This is an experimental feature. The compiler and processing modules are loaded from CDN on first use, which may take a moment. Your text is processed entirely in your browser and is never sent to any server.