LLMs Unplugged

Most people’s mental model of how ChatGPT works is basically “magic” — and that’s a problem if we want informed public discourse about AI. LLMs Unplugged is a set of hands-on activities that teach the training-to-generation loop using hand-built n-gram models and weighted random sampling. No computers, no coding, no hand-waving.
Context

The core activity is disarmingly simple: participants count word patterns in a children’s book (tally marks on grid paper, tokens in labelled buckets, whatever works), then generate new text by rolling dice weighted according to those counts. Something clicks when you’re sitting there rolling dice and watching plausible-ish sentences emerge from pure statistics — you realise that the difference between your grid paper and GPT-4 is scale, not sorcery.
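The activities need no code at all, but for readers who want to map the dice and tally marks onto the maths, the whole loop fits in a few lines. This is a minimal sketch for illustration, not part of the project's toolchain:

```python
import random
from collections import defaultdict

def train_bigrams(text):
    """'Training' is counting: tally how often each word follows another."""
    words = text.lower().split()
    counts = defaultdict(lambda: defaultdict(int))
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def generate(counts, start, length=8):
    """'Generation' is sampling: roll a die weighted by the tallies."""
    word = start
    out = [word]
    for _ in range(length - 1):
        followers = counts.get(word)
        if not followers:  # dead end: this word was never followed by anything
            break
        word = random.choices(list(followers), weights=list(followers.values()))[0]
        out.append(word)
    return " ".join(out)
```

Swap `random.choices` for a physical die and `counts` for a grid of tally marks and you have the classroom version exactly.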
The project builds on the CS Unplugged tradition — and, further back, on Claude Shannon’s 1948 work systematically generating synthetic English text from hand-drawn frequency tables — but fills a genuine gap in LLM-specific unplugged resources. The fundamental insight that training is counting and generation is sampling carries through from a single-page bigram grid all the way to a transformer with billions of parameters.

We’ve run these activities with over 400 participants, from primary school students through to senior APS executives, and the format adapts surprisingly well across that range. The most common “aha moment” people report is realising that LLMs are doing probability and randomness at scale — not reasoning, not understanding, but sophisticated pattern matching and weighted sampling. That demystification seems particularly valuable for non-technical participants who may have heard LLMs described in almost magical terms.
For those who want to go deeper, extension lessons cover temperature and truncation, trigrams, hand-crafted attention (which we call "context columns"), word embeddings, LoRA, RLHF, synthetic data, and agentic tool use.
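As a rough illustration of the temperature and truncation lessons (the classroom mechanics differ, and these function names are ours, not the project's), both are just transformations applied to the tallies before the dice roll:

```python
def apply_temperature(counts, temperature):
    """Reweight raw tallies before sampling.

    temperature < 1 sharpens the distribution (common words win more often);
    temperature > 1 flattens it (rare words get a better chance);
    temperature == 1 leaves the tallies as they are.
    """
    return {word: count ** (1.0 / temperature) for word, count in counts.items()}

def truncate_top_k(counts, k):
    """Truncation: drop all but the k most common continuations entirely."""
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:k])
```

On paper, sharpening the distribution can be as simple as copying each tally row with the biggest counts doubled, and truncation is crossing out the rarest columns.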
Technical details

Behind the scenes, an automated pipeline takes any plain text file through a Rust CLI to produce a JSON n-gram model, then uses Typst to generate a PDF booklet — everything a participant needs for dice-powered text generation. The same toolchain produces the web-based tools at llmsunplugged.org/tools. Booklets can be generated for bigram, trigram, and 4-gram models from arbitrary source texts, and the progression across model orders demonstrates the fundamental trade-off between context length and output coherence.
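The pipeline itself is a Rust CLI and its JSON schema isn't reproduced here, but the model it emits is conceptually just nested tables of counts keyed by context. A hypothetical Python sketch of the same shape (names and schema are illustrative, not the actual tool's):

```python
import json
from collections import Counter

def build_ngram_model(text, n=2):
    """Slide a window of n words across the text; the first n-1 words form the
    context, and the last word is the continuation being tallied."""
    words = text.lower().split()
    model = {}
    for i in range(len(words) - n + 1):
        context = " ".join(words[i:i + n - 1])
        model.setdefault(context, Counter())[words[i + n - 1]] += 1
    # Convert Counters to plain dicts so the model serialises cleanly to JSON.
    return {context: dict(tallies) for context, tallies in model.items()}

# e.g. json.dumps(build_ngram_model(source_text, n=3)) yields a trigram model:
# each two-word context maps to a tally of the words seen after it.
```

Raising `n` makes each context more specific, which is the context-versus-coherence trade-off the booklets demonstrate: longer contexts produce more coherent text but match the source less often.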
All materials are available under CC BY-NC-SA 4.0 at llmsunplugged.org.