LLM · Day 21 of 30 · Learn by clicking · ~3 minutes
01 / 05

What is an "LLM," really?

An LLM (Large Language Model) is a prediction engine. Trained on a huge amount of text, it does one thing: given the words so far, it predicts the next chunk of text. Think of it as the world's most well-read autocomplete. Here is the whole idea in one picture:

Your text so far: your question, plus everything said before it
→ feeds the model
The LLM: patterns learned from lots of text (predict() next word)
→ outputs
The next chunk: one likely piece of text, then repeat…

  • 🔁 Predicts one chunk at a time, then loops
  • 📚 Learned from patterns in text, not a fact list
  • ✨ Great at language and patterns
  • ⚠️ No built-in truth check

A chunk here means a token: a word or a piece of a word, the small unit the model reads and writes in. The model picks the next token, adds it to the text, then predicts again. Next-token prediction, practiced billions of times during training and repeated in a loop every time you use the model, is why an LLM is both amazing and fallible. You will watch it predict in a minute.
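
The predict-append-repeat loop described above can be sketched in a few lines of Python. This is a toy under stated assumptions, not how a real model is built: the `NEXT_TOKEN` table is a hypothetical, hand-written stand-in for what a trained model computes, and the sketch always takes the single most likely token.

```python
# Toy stand-in for a trained model: for each token, some candidate next
# tokens with made-up likelihoods. A real LLM computes these from the
# whole text so far, not just the last token.
NEXT_TOKEN = {
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "ran": 0.3},
    "sat": {"down": 0.8, "up": 0.2},
}


def predict_next(tokens):
    """Return the most likely next token, or None if the toy table has no entry."""
    candidates = NEXT_TOKEN.get(tokens[-1])
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


def generate(prompt_tokens, max_new_tokens=5):
    """Predict a token, append it, and repeat: the loop from the lesson."""
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        next_token = predict_next(tokens)
        if next_token is None:  # the toy table ran out of entries
            break
        tokens.append(next_token)
    return tokens


print(generate(["the"]))  # ['the', 'cat', 'sat', 'down']
```

Nothing in the loop checks whether the output is true. It only asks which token is likeliest next, which is the behavior the rest of this lesson builds on.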

02 / 05 · Bust the myth

The myth: "it looks things up."

This is the single most common misunderstanding. Many people picture an LLM as a search engine with a giant database of facts inside it. That picture is wrong. An LLM does not keep facts in records it can look up. It predicts text.

The myth: "It has a database of facts and looks up the answer." If that were true, it would either know a fact or politely say it does not. It would not invent things.

The reality: It generates the most likely-sounding next text based on patterns it learned. Often that lands on the truth, because true statements are common in its training text. Sometimes it produces a fluent, confident answer that is simply wrong. This is called a hallucination: text that sounds right but is not.

A database (the myth)

What people imagine is happening.

  • ✓ Stores exact facts in records
  • ✓ Returns a fact or "not found"
  • ✓ Same input gives the same row
  • – But this is not how an LLM works

A prediction engine (real)

What is actually happening.

  • ✓ Predicts likely next text from patterns
  • ✓ Brilliant at phrasing, tone, structure
  • – Has no built-in "is this true?" check
  • – Can be confidently, fluently wrong

The key: "sounds right" and "is right" are two different things, and an LLM optimizes for the first. That is why it can be confidently wrong. Spotting when that happens is its own skill (see Spotting bad output).
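
To make the contrast concrete, here is a minimal Python sketch. Everything in it is hypothetical and chosen for illustration: `FACTS` plays the database, and `toy_predict` is a deliberately crude stand-in for a model that continues text with whatever pattern it has seen most often.

```python
# Hypothetical data for illustration only.
FACTS = {"capital of France": "Paris"}

# Toy "model": it has seen the word "capital" followed by "Paris" so often
# that it continues with "Paris" whenever "capital" appears.
PLAUSIBLE_CONTINUATIONS = {"capital": "Paris"}


def database_lookup(question):
    """A database returns the stored fact, or None for 'not found'."""
    return FACTS.get(question)


def toy_predict(question):
    """A predictor always produces fluent text, whether or not it is true."""
    for cue, continuation in PLAUSIBLE_CONTINUATIONS.items():
        if cue in question:
            return continuation
    return "It depends."


print(database_lookup("capital of France"))    # Paris
print(database_lookup("capital of Atlantis"))  # None
print(toy_predict("capital of France"))        # Paris
print(toy_predict("capital of Atlantis"))      # Paris
```

The last line is the hallucination in miniature: the database admits it has no record, while the predictor returns the same confident answer for a place that does not exist.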

03 / 05 · Watch it work

Watch it predict, not look up.

Pick a starting phrase. The page reveals the next words an LLM might choose, with rough probabilities. Notice: it is weighing likely text, not fetching a fact from a table.

🔒 This runs entirely in your browser. Nothing is sent anywhere.

These numbers are illustrative and approximate, hand-picked for teaching. A real model weighs every token in its vocabulary, typically tens of thousands of candidates or more, at each step. The point is the shape of it: several candidates, each with a likelihood, one winner, then it repeats.
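
The same shape can be written down in Python. The candidates and probabilities below are hand-picked for a hypothetical starting phrase ("The sky is") and do not come from any real model. The sketch shows two common ways to pick the winner: always take the top candidate, or draw at random in proportion to the likelihoods.

```python
import random

# Hand-picked, illustrative numbers (not from a real model) for the
# hypothetical starting phrase "The sky is".
candidates = {"blue": 0.62, "clear": 0.18, "falling": 0.12, "green": 0.08}


def most_likely(probs):
    """Greedy choice: always take the top candidate."""
    return max(probs, key=probs.get)


def sample(probs, rng):
    """Weighted choice: likelier tokens win more often, but not always."""
    tokens = list(probs)
    weights = list(probs.values())
    return rng.choices(tokens, weights=weights, k=1)[0]


rng = random.Random(0)  # fixed seed so repeated runs give the same draws
print(most_likely(candidates))  # blue
print([sample(candidates, rng) for _ in range(5)])
```

Weighted sampling is one reason the same prompt can produce different answers on different runs: a less likely candidate sometimes wins the draw.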

💡 See this idea doing real work: feeding the model the right text up front (called context) is what makes its predictions land on your facts. That is the whole game behind RAG (retrieval-augmented generation) and prompting.
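
A rough sketch of that idea in Python: find the most relevant snippet, then place it in front of the question. The documents, function names, and prompt wording are all hypothetical, and the word-overlap scoring is a simple stand-in chosen for readability; production systems typically use more capable search.

```python
# Hypothetical snippets standing in for your own documents.
DOCUMENTS = [
    "Refunds are accepted within 30 days of purchase.",
    "Support is available Monday to Friday, 9am to 5pm.",
    "Shipping to Canada takes 5 to 7 business days.",
]


def words(text):
    """Lowercase the text and split it into a set of bare words."""
    return {w.strip(".,?!").lower() for w in text.split()}


def retrieve(question, documents):
    """Pick the snippet sharing the most words with the question (crude on purpose)."""
    q = words(question)
    return max(documents, key=lambda doc: len(q & words(doc)))


def build_prompt(question, documents):
    """Put the retrieved snippet in front of the question as context."""
    context = retrieve(question, documents)
    return (
        "Answer using only the context below.\n"
        f"Context: {context}\n"
        f"Question: {question}"
    )


print(build_prompt("How many days do I have for refunds?", DOCUMENTS))
```

The model never "looks up" the refund policy. The policy is simply part of the text it is asked to continue, so the likeliest continuation is one that agrees with it.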

04 / 05 · Use it wisely

Training vs using: what it does and does not know.

Two moments get mixed up constantly. Keep them separate and most "AI is creepy / AI is magic" confusion clears up.

  • 1. Training happened once, before you. The model learned its patterns from text up to a cutoff date, then those patterns were frozen. It is not browsing the live web as it answers (unless a tool is explicitly added).
  • 2. It does not learn from your chats by default. Using it does not silently retrain it. By default your message shapes this answer and is then gone from the model. (Always check a specific vendor's data policy, since settings vary.)
  • 3. It has no clock and no memory of yesterday. Unless you give it today's date or past notes in the text, it does not have them.
  • 4. It will not reliably say "I don't know" on its own. Because it predicts plausible text, it often fills gaps confidently. Treat unverified specifics (numbers, names, quotes, citations) as claims to check.
  • 5. The fix is context, not faith. Give it the right facts in the prompt and its predictions get far more reliable. Good context reduces wrong answers; it does not eliminate them.
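
Points 3 to 5 come down to one habit: write what the model needs into the text it sees. Below is a minimal Python sketch of that habit. The function name `assemble_context`, the prompt wording, and the sample data are assumptions made for illustration, not any vendor's API.

```python
from datetime import date


def assemble_context(question, today, notes=(), facts=()):
    """Assemble the text the model will actually see.

    Anything not written into this string (today's date, earlier notes,
    your facts) is not available to the model when it answers.
    """
    lines = [f"Today's date: {today.isoformat()}"]
    if notes:
        lines.append("Notes from earlier sessions:")
        lines.extend(f"- {note}" for note in notes)
    if facts:
        lines.append("Facts to rely on:")
        lines.extend(f"- {fact}" for fact in facts)
    lines.append("If the answer is not in the text above, say you do not know.")
    lines.append(f"Question: {question}")
    return "\n".join(lines)


# Hypothetical example data.
prompt = assemble_context(
    "When is the invoice due?",
    today=date(2024, 1, 15),
    notes=["Customer asked about invoice 1042 last week."],
    facts=["Invoice 1042 is due 30 days after it was issued on 2024-01-02."],
)
print(prompt)
```

The explicit "say you do not know" line nudges the prediction toward admitting gaps. As the list notes, that makes wrong answers less likely without ruling them out.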

Specifics like model names, prices, and context-window sizes change quickly. As of writing, mainstream models read roughly tens to hundreds of thousands of tokens of context at once, with a few going higher. Treat any exact figure as a snapshot, and rely on the durable idea: more relevant context in, more reliable text out.
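
Because the context window is finite, whatever you feed in has to fit a budget. The Python sketch below keeps snippets until a rough budget is used up. It assumes the snippets are already ordered from most to least relevant, and it counts one token per whitespace-separated word, which is only a ballpark: real tokenizers split text differently.

```python
def rough_token_count(text):
    """Crude estimate: one token per whitespace-separated word."""
    return len(text.split())


def fit_to_budget(snippets, budget):
    """Keep snippets, in order, until the rough token budget is used up."""
    kept, used = [], 0
    for snippet in snippets:
        cost = rough_token_count(snippet)
        if used + cost > budget:
            break
        kept.append(snippet)
        used += cost
    return kept


# Hypothetical snippets, assumed to be sorted most relevant first.
snippets = ["alpha beta gamma", "delta epsilon", "zeta eta theta iota"]
print(fit_to_budget(snippets, budget=5))  # ['alpha beta gamma', 'delta epsilon']
```

The budget is a parameter rather than a constant because, as noted above, actual window sizes change quickly from model to model.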

05 / 05 · Done

You now understand LLMs better than most people who use them daily.

You know an LLM is a prediction engine, not a fact database. You know why it can be confidently wrong, that it does not learn from your chats by default, and that the right context is what makes it reliable.

Understanding the engine is step one. We put it to work safely in your business, wired to your real data with the guardrails that keep its predictions honest.

Built by rabbithole.consulting: custom-built infrastructure that runs your business. This lesson runs entirely in your browser · Free under MIT.