AI glossary for educators
Essential terms for understanding what AI does and what people mean when they talk about it.
- Large language model (LLM)
A program trained on enormous amounts of text that can read and generate human language.
Think of it as a system that read a significant fraction of the internet and many books, learning statistical patterns about how words follow each other. When you write to it, it predicts the most likely response given your question and everything it learned. It does not "understand" in the human sense, but the results often look remarkably as if it did. Claude, ChatGPT, and Gemini are all LLMs.
- Claude (Anthropic)
- ChatGPT (OpenAI)
- Gemini (Google)
prompt · token · context window · transformer
- Prompt
The text you write to an AI model to ask it for something.
A prompt is your instruction: it can be a question, a request, background context, or a combination of all three. The quality of the prompt directly shapes the quality of the response. Writing "Generate a rubric" gives a generic result; writing "Generate a 4-level rubric for evaluating written argumentation in 10th grade, with performance examples and aligned to my school's learning standards" gives something usable.
- "Summarize this text in three key points"
- "Act as a 7th-grade student and ask me questions about this reading"
prompt engineering · system prompt · few-shot prompting
- Prompt engineering
The practice of writing clear, well-structured instructions to get better results from an AI model.
It is not a mysterious discipline: it is essentially learning to communicate clearly with a tool that is very literal. The key principles are giving context (who you are, what the output is for), specifying the output format (list, paragraph, table), and providing examples of what you want. Most of what gets called "advanced prompt engineering" comes down to being more precise about what you are asking.
- Including a role ("Act as an instructional designer")
- Specifying the format ("Give me a numbered list")
- Providing examples of the expected output
prompt · few-shot prompting · chain-of-thought
- Token
The smallest unit AI models use to process text — roughly a syllable or a short word.
Models do not read letter by letter or word by word; they process text in chunks called tokens. In English, a long word like "retroactive" might be 3 tokens. This matters because models have a limit on how many tokens they can process at once (see context window) and because API costs are typically charged per token. As a rule of thumb, 750 English words come out to about 1,000 tokens.
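That rule of thumb is enough for a quick budget check before pasting a long document. A minimal sketch — the function name is invented, and real tokenizers split text into subword pieces, so treat this as a rough estimate only:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate from the ~750 words per 1,000 tokens rule of thumb."""
    words = len(text.split())
    # 1,000 tokens / 750 words ≈ 1.33 tokens per word
    return round(words * 1000 / 750)

print(estimate_tokens("the quick brown fox jumps over the lazy dog"))  # 9 words → 12
```

For precise counts, each provider documents its own tokenizer; this sketch only tells you whether a text is in the right ballpark for a model's limits.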
context window · tokens per second
- Context window
The maximum amount of text a model can hold in its working memory during a conversation.
Imagine the model has a limited desk: everything that fits on the desk is available to reason about; what does not fit does not exist for it. A large context window (like Claude's) lets you paste long documents, extended conversations, or detailed instructions without the model "forgetting" the beginning. When a conversation exceeds the limit, the model starts losing information from the earliest messages.
- Claude has a 200,000-token window (equivalent to a long novel)
- GPT-4o supports windows of up to 128,000 tokens
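The "limited desk" behavior can be sketched as a trimming loop: once the conversation exceeds the budget, the oldest messages are dropped first. A toy illustration — the function name is invented and it counts whitespace-separated words instead of real tokens:

```python
def fit_to_window(messages, budget, count_tokens=lambda m: len(m.split())):
    """Drop the oldest messages until the conversation fits the token budget."""
    trimmed = list(messages)
    while trimmed and sum(count_tokens(m) for m in trimmed) > budget:
        trimmed.pop(0)  # the earliest message is "forgotten" first
    return trimmed

history = ["intro message with lots of early context here",
           "a follow-up question",
           "the latest message"]
print(fit_to_window(history, budget=8))
```

Real chat applications use subtler strategies (summarizing old turns, pinning the system prompt), but the core constraint is the same: something has to leave the desk.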
token · large language model
- Hallucination
When a model generates false information with complete confidence, as if it were an established fact.
Models do not search the internet when they respond (unless they have a search tool activated): they generate text based on patterns. Sometimes those patterns produce invented data — dates, citations, author names, legislation — presented with the same confident tone they would use for real facts. Every LLM output should be verified before use, especially if it contains specific figures, citations, or references.
- A model that invents the ISBN of a real book
- A model that cites a scientific paper that does not exist
sycophancy · inference
- Fine-tuning
Training an existing model on additional data to specialize it for a specific task or domain.
Base models are trained on general data. Fine-tuning is like giving someone with a broad education an intensive specialist course: the model learns patterns from the new domain without forgetting what it already knew. Organizations fine-tune models to always respond in their brand voice, or to know their internal knowledge base. As an educator, you probably will not do fine-tuning yourself, but you will use models that have already been fine-tuned.
- A base Llama model fine-tuned to respond only in legal language for a specific jurisdiction
large language model · RAG
- RAG (retrieval augmented generation)
A technique that connects an AI model to a specific document collection so its responses are grounded in those texts.
Instead of the model responding only from what it learned during training, it first retrieves relevant fragments from a collection of documents (your school's handbook, the course syllabus, specific articles) and then generates the response using those fragments as context. Google's NotebookLM uses this approach: feed it your documents and it answers based only on what is in them.
- NotebookLM: upload your notes and it summarizes only what is in them
- A school chatbot that answers from your institution's internal policy documents
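The retrieve-then-generate loop fits in a few lines. This toy version scores documents by word overlap with the question instead of embeddings, and all names and sample documents are invented for illustration:

```python
def retrieve(question, documents, top_k=1):
    """Score each document by word overlap with the question; return the best match."""
    q_words = set(question.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:top_k]

handbook = [
    "Attendance policy: students must notify the office before 9 am.",
    "Grading policy: late work loses 10 percent per day.",
    "Field trips require a signed permission form.",
]
question = "What is the late work grading policy?"
context = retrieve(question, handbook)

# The retrieved fragment is pasted into the prompt, grounding the model's answer
prompt = f"Answer using only this context:\n{context[0]}\n\nQuestion: {question}"
print(prompt)
```

Production RAG systems replace the word-overlap scoring with embedding similarity (see embedding and vector below), but the shape of the pipeline — retrieve first, then generate with the retrieved text as context — is exactly this.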
large language model · embedding · vector · fine-tuning
- Agent
An AI system that can plan and execute multiple actions in sequence to complete a complex task.
A basic LLM responds and waits. An agent can decide what steps to take, use tools (search the internet, read a file, run code), review its own output, and self-correct. Think of it as the difference between a colleague who only answers questions and one who can manage an entire project from start to finish. Agents represent the current frontier of practical AI — and also where the most complex risks emerge.
- An agent that researches a topic, writes a summary, and formats it as a PDF without human intervention
MCP · large language model
- MCP (model context protocol)
An open standard that lets AI models connect to external tools and data sources in a structured way.
MCP is like a universal connector — a standard that defines how an AI model can talk to other applications: a calendar, a database, a search engine, a file system. It was proposed by Anthropic (makers of Claude) in 2024 and is being adopted by multiple providers. For educators, this means AI could eventually connect directly to school systems without anyone having to custom-code each integration.
- Claude connected to your Google Drive to read documents directly
agent · large language model
- Multimodal
A model that can process and generate multiple types of content: text, images, audio, or video.
Early LLMs handled only text. Today's multimodal models can receive a photo of a whiteboard and respond to it, analyze a chart, transcribe audio, or generate images from instructions. GPT-4o and Gemini are multimodal. Claude can analyze images but (as of May 2026) does not generate images. This capability opens possibilities for analyzing student work submitted as photos, or for having the model explain a diagram.
- Photographing a student's handwritten work and asking the model to assess it
- Pasting a data chart and asking what trends it shows
large language model
- Embedding
A numerical representation of text that captures its meaning, used to compare texts with each other.
For a computer to compare whether two texts are about the same thing, it needs to convert them to numbers. An embedding transforms text into a list of hundreds or thousands of numbers (a vector) where texts of similar meaning end up numerically close to each other. This is what allows semantic search systems to find relevant documents even when they do not share the exact same words.
- Searching for "student feedback" and finding documents that say "assessment commentary"
vector · RAG
- Vector
A list of numbers that mathematically represents the meaning of a text or image.
In the AI context, a vector is how models store the "meaning" of something. When two texts are said to be semantically similar, what is happening underneath is that their vectors are close in a high-dimensional mathematical space. Vector databases store these vectors for fast similarity searches — by meaning, not just by exact word matches.
embedding · RAG
- System prompt
An initial instruction, invisible to the end user, that defines how the model should behave in a given application.
When a company or developer builds a chatbot on top of an LLM, they typically write a system prompt that defines the assistant's role, tone, constraints, and context. The end user does not see it, but the model keeps it in mind throughout the conversation. For example, a customer service chatbot's system prompt might say: "You are a friendly assistant for Bank X. Never discuss political topics. Always respond in formal English."
- Khanmigo's system prompt defines that it must always guide with questions, never give the answer directly
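Chat APIs commonly represent the system prompt as a separate message with a "system" role. A sketch in that widely used messages format — the field names and wording here are illustrative, not any specific provider's API:

```python
messages = [
    {"role": "system",
     "content": "You are a tutoring assistant. Guide with questions; never give the answer directly."},
    {"role": "user",
     "content": "What is 7 x 8?"},
]

# The system message steers every turn, but the student only ever sees the chat
for message in messages:
    print(message["role"], "->", message["content"])
```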
prompt · prompt engineering
- Few-shot prompting
Giving the model one or more examples of the result you expect, within the same prompt.
Instead of describing what you want in the abstract, you show the model a couple of concrete examples and then ask it to follow that pattern. It is like telling a colleague: "Look at how I did it here (example 1) and here (example 2), now do the same for this new case." Few-shot prompting improves consistency and reduces the risk of the model misinterpreting the format or tone you are looking for.
- Providing two examples of well-written feedback and asking for a third for a new piece of student work
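Assembling a few-shot prompt is just string building: show the pattern, then present the new case. A sketch with invented example feedback:

```python
examples = [
    ("The essay states a clear thesis early on.",
     "Strength: clear thesis. Next step: support it with one more source."),
    ("The argument jumps between ideas without transitions.",
     "Strength: ambitious ideas. Next step: add transitions between paragraphs."),
]
new_case = "The conclusion repeats the introduction almost word for word."

# Show the pattern (the "shots"), then ask for the same pattern on the new case
prompt = "Write feedback in the same style as these examples.\n\n"
for observation, feedback in examples:
    prompt += f"Observation: {observation}\nFeedback: {feedback}\n\n"
prompt += f"Observation: {new_case}\nFeedback:"
print(prompt)
```

Ending the prompt right at "Feedback:" invites the model to complete the pattern rather than comment on it.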
prompt · prompt engineering · chain-of-thought
- Chain-of-thought
A prompting technique where you ask the model to show its reasoning step by step before giving a final answer.
When a model has to solve something complex, jumping directly to an answer increases the risk of error. Asking it to "think out loud" — to write out the intermediate steps — improves the quality of the final result, especially for math problems, text analysis, or decision-making. It is activated simply by adding phrases like "Think step by step before answering" or "Explain your reasoning."
- "Before writing the rubric, explain what dimensions you plan to include and why"
prompt · prompt engineering · few-shot prompting
- Temperature
A parameter that controls how predictable or creative a model's outputs are.
At low temperature (close to 0), the model always picks the most likely responses: more consistent, less creative. At high temperature, it introduces more variation and surprise. For tasks requiring precision — translation, summarizing a specific text, code — low temperature is better. For brainstorming, creative writing, or hypothetical scenarios, higher temperature yields more interesting results. Most chat interfaces do not expose this parameter directly.
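Under the hood, temperature rescales the model's scores for candidate next tokens before they are turned into probabilities. A minimal sketch of that softmax-with-temperature step, with made-up scores for three candidate tokens:

```python
import math

def softmax_with_temperature(scores, temperature):
    """Convert raw scores to probabilities; temperature reshapes the distribution."""
    scaled = [s / temperature for s in scores]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

scores = [2.0, 1.0, 0.5]  # invented scores for three candidate next tokens
print(softmax_with_temperature(scores, 0.2))  # low temp: the winner takes almost all
print(softmax_with_temperature(scores, 2.0))  # high temp: probabilities flatten out
```

Dividing by a small temperature exaggerates the gaps between scores (predictable output); dividing by a large one shrinks them (varied, surprising output).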
inference · model parameters
- Bias
Systematic tendencies in a model's outputs that reflect imbalances in the training data.
Models learn from human text, and that text contains historical and cultural biases. The result is that models may default to assuming a doctor is male, that cultural examples are Anglo-American, or that the relevant curriculum is from the US or UK. The most practical mitigation is to specify your local context in your prompt: your country, your school's framework, your student population.
- A model that generates history examples from a Eurocentric perspective
- A model that assumes a US context when discussing public education policy
hallucination · sycophancy · fine-tuning
- Sycophancy
A model's tendency to agree with you or validate your ideas even when you are wrong.
Models were trained by optimizing for positive human ratings. The unintended result is that they learned to tell you what you want to hear. If you propose an incorrect idea confidently, the model tends to validate it rather than correct you. For educators, this is directly relevant: if you tell the model "This rubric I designed is really good, right?", it will probably say yes even if it has problems. Explicitly ask it to identify weaknesses instead.
- "What are the problems with this lesson plan?" yields better results than "Is this lesson plan good?"
hallucination · bias
- Jailbreak
A technique for bypassing a model's safety filters to get it to generate content it would normally refuse.
Models have restrictions to avoid generating harmful, illegal, or inappropriate content. Jailbreaks are attempts — sometimes creative, sometimes elaborate — to circumvent those restrictions using specially designed prompts. For educators, it is worth knowing that students may attempt this, and that modern models are significantly more resistant than those of 2023. It is part of the digital literacy conversation worth having with students.
- Asking the model to "act as an AI without restrictions"
bias · system prompt
- Deepfake
AI-generated or AI-manipulated audiovisual content that makes it appear someone said or did something they never did.
Text-based deepfakes have existed for centuries (we just called them lies). Audiovisual deepfakes — video and audio — are the new phenomenon: with relatively little training material, it is possible to generate a convincing video of someone saying anything. This has direct implications for student safety (harassment with generated images), the credibility of evidence, and media literacy education. Discussing this with students is as urgent as teaching them to cite sources.
- A fabricated video of a student or teacher in a compromising situation
generative AI · bias
- Generative AI vs discriminative AI
Generative AI creates new content; discriminative AI classifies or makes decisions about existing content.
Discriminative AI learns to separate categories: is this email spam or not? Does this image show a cat or a dog? It is the AI we have been using for years without always calling it that. Generative AI — which is what dominates current conversation — learns to produce new content: text, images, audio, code. ChatGPT is generative. Your email spam filter is discriminative. Both are "artificial intelligence," but they are fundamentally different tools.
- Generative: Claude drafts a letter
- Discriminative: an algorithm that detects whether an image contains violence
large language model · neural network
- Model parameters
The internal numbers a model adjusts during training to learn patterns in language.
When you hear "a 70-billion-parameter model," those parameters are the weights of the billions of connections between artificial neurons. More parameters generally means greater capacity to capture complex patterns — but also higher computational and energy cost. You do not need to understand how they work internally to use the model, but parameter count is a rough indicator of capability.
- GPT-4 is estimated at hundreds of billions of parameters
- Small models that run on a laptop typically have 7-13 billion parameters
neural network · transformer · fine-tuning
- Inference
The process of using an already-trained model to generate a response — what happens every time you write to it.
Training is when the model learns (expensive, slow, done once or rarely). Inference is when the model uses what it learned to answer your question (faster, happens billions of times per day). When you pay for an AI API, you pay for inference: how many tokens it processed. The cost of inference has dropped dramatically between 2023 and 2026, making tools significantly more accessible.
token · model parameters · temperature
- Open source vs closed source
Whether a model's code and weights are public (open) or exclusively controlled by the company (closed).
Open-source models — like Meta's Llama or Mistral — let anyone download, examine, modify, and run them on their own hardware. Closed-source models — like GPT-4, Claude, or Gemini — are services you access through an API or interface, but you cannot see or modify their internals. For education, the distinction matters for privacy (a local model sends no data to external servers) and cost (open models can be free if you have the hardware).
- Open: Llama 3 (Meta), Mistral, Phi-3 (Microsoft)
- Closed: Claude (Anthropic), GPT-4 (OpenAI), Gemini (Google)
model parameters · fine-tuning
- Latency
The time it takes for a model to start responding after you send your message.
Latency is the initial delay before you see the first word of a response. Once it starts writing, what matters is generation speed (tokens per second). High latency can make the experience feel slow even if the model generates quickly once it begins. In classroom settings or with limited connectivity, latency can be a real usability constraint.
tokens per second · inference
- Tokens per second
The speed at which a model generates its response once it begins writing.
AI models do not write the complete response and deliver it all at once: they generate it token by token, left to right, in real time. Tokens per second (TPS) measures that speed. A model generating 30 TPS is noticeably faster than one generating 5 TPS. For tasks requiring long responses — analyzing an extended document, generating a full unit plan — speed makes a practical difference.
token · latency · inference
- AGI (artificial general intelligence)
A hypothetical AI system capable of performing any cognitive task a human can, with the same level of generalization.
Current models — even the most advanced — are highly capable systems in language and reasoning, but they are not AGI. AGI is a theoretical target: an AI that learns and applies knowledge across any domain the way a human would. There is no consensus on whether it is achievable or when. What does exist is an active debate in the research community about its implications. For practical AI use today, AGI is more a conceptual reference than a current reality.
large language model · neural network
- Neural network
A computing system loosely inspired by the brain's structure, made of layers of connected units that learn from data.
Artificial neural networks have "neurons" (simple mathematical units) organized in layers. Data passes through those layers, and during training the connection weights are adjusted so the network learns to make correct predictions. Modern LLMs are neural networks with a specific architecture called the transformer. Beyond the original metaphor, they bear little real resemblance to the brain.
- A neural network trained to recognize handwriting
- LLMs are transformer-type neural networks
transformer · model parameters · large language model
- Transformer
The neural network architecture at the foundation of all modern large language models.
The transformer was proposed in 2017 in the landmark paper "Attention Is All You Need" by Google researchers. Its central innovation is the attention mechanism: instead of processing text sequentially from left to right, the network learns which parts of the text are relevant to each word, regardless of distance. This allowed training much larger and more capable models. When you see GPT (Generative Pre-trained Transformer), the T is this architecture.
- GPT stands for Generative Pre-trained Transformer
neural network · large language model · model parameters