AI Glossary: Every Term Explained Simply

Updated 2026-03-10

AI has its own vocabulary, and it can be intimidating. This glossary defines every major term in plain language, organized alphabetically. Bookmark it and refer back whenever you encounter an unfamiliar term.

A

Agent — An AI system that can take actions (browse the web, run code, call APIs) to accomplish multi-step goals, not just generate text.

Alignment — The process of ensuring an AI model’s behavior matches human values and intentions. A well-aligned model is helpful, harmless, and honest.

API (Application Programming Interface) — A way to access AI models programmatically from your code, rather than through a chat interface. You send requests and receive responses over the internet.
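As a sketch, an API request is just structured data sent over the network. The payload below is modeled loosely on common chat-style APIs; the field names, model name, endpoint, and authentication all differ by provider, so treat everything here as illustrative rather than any provider’s actual schema:

```python
import json

# Build an illustrative chat-style request payload. Field names are
# hypothetical approximations of common APIs; check your provider's docs.
def build_request(prompt: str, model: str = "example-model", max_tokens: int = 256) -> dict:
    return {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }

# In a real integration this JSON string would be POSTed to the provider's
# endpoint with an API key; here we just print it.
body = json.dumps(build_request("Explain tokens in one sentence."))
print(body)
```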

B

Batch Processing — Sending many AI requests at once for processing in bulk, often at a discounted rate compared to real-time requests.

Benchmark — A standardized test used to compare AI model performance. Examples include MMLU (general knowledge), HumanEval (coding), and MATH (mathematical reasoning).

C

Chain-of-Thought (CoT) — A prompting technique where you ask the AI to reason step by step before giving a final answer. Improves accuracy on complex tasks.
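A chain-of-thought prompt can be as simple as appending a step-by-step instruction to the question. The wrapper below uses one common phrasing; it is a sketch, not a canonical formula:

```python
def chain_of_thought(question: str) -> str:
    # Wrap any question in a step-by-step instruction before sending it
    # to a model. The exact wording is just one widely used variant.
    return (
        f"{question}\n\n"
        "Let's think step by step. Show your reasoning, then give the "
        "final answer on its own line prefixed with 'Answer:'."
    )

print(chain_of_thought("A shirt costs $20 and is discounted 25%. What is the sale price?"))
```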

Constitutional AI (CAI) — Anthropic’s alignment technique where the AI evaluates its own outputs against a set of principles (a “constitution”) to improve safety and helpfulness.

Context Window — The maximum amount of text (measured in tokens) that a model can process in a single request. Includes both input and output.

D

Dense Model — A neural network where all parameters are used for every input. Contrast with Mixture of Experts (MoE), where only a subset is active.

Diffusion Model — A type of AI model used for image generation (e.g., Stable Diffusion, DALL-E). Generates images by progressively refining random noise.

E

Embedding — A numerical representation of text (or images) as a list of numbers (vector). Used for search, similarity comparison, and retrieval-augmented generation.
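Similarity between embeddings is typically measured with cosine similarity. The three-dimensional vectors below are invented for illustration; real embeddings have hundreds or thousands of dimensions:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # 1.0 means the vectors point the same way; values near 0 mean unrelated.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy embeddings: semantically similar words get nearby vectors.
cat = [0.9, 0.1, 0.3]
kitten = [0.85, 0.15, 0.35]
car = [0.1, 0.9, 0.2]

print(cosine_similarity(cat, kitten))  # high: similar meanings
print(cosine_similarity(cat, car))     # lower: different meanings
```

This nearest-by-cosine idea is what powers semantic search and RAG retrieval.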

Emergent Capabilities — Abilities that appear in larger models but are absent in smaller ones, without being explicitly trained for. An area of active debate.

F

Few-Shot Prompting — Providing a few examples of the desired input-output pattern in your prompt before asking the model to perform the task.
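A few-shot prompt simply interleaves example inputs and outputs before the real query. A minimal sketch (the “Input:”/“Output:” labels are one convention among many):

```python
def few_shot_prompt(examples: list[tuple[str, str]], query: str) -> str:
    # Format each example as an input/output pair, then append the new
    # input and leave the final output for the model to complete.
    lines = []
    for inp, out in examples:
        lines.append(f"Input: {inp}")
        lines.append(f"Output: {out}")
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)

prompt = few_shot_prompt(
    [("great movie!", "positive"), ("waste of money", "negative")],
    "I'd watch it again",
)
print(prompt)
```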

Fine-Tuning — Training an existing model on additional data to specialize it for a particular task or domain. Less expensive than training from scratch.

Foundation Model — A large AI model trained on broad data that can be adapted to many tasks. GPT-4, Claude, Gemini, and Llama are all foundation models.

G

Generative AI — AI that creates new content (text, images, audio, video, code) rather than just classifying or analyzing existing content.

GPT (Generative Pre-trained Transformer) — OpenAI’s family of language models. “GPT” has also become a general term for large language models.

Guardrails — Rules and filters that constrain AI model outputs to prevent harmful, inappropriate, or off-topic responses.

H

Hallucination — When an AI model generates information that sounds plausible but is factually incorrect or entirely fabricated.

I

Inference — The process of running a trained AI model to generate outputs. “Inference cost” is what you pay per query.

Instruction Tuning — Training a model to follow natural language instructions across diverse tasks.

J-K

JSON Mode — A feature in some APIs that forces the model to output valid JSON, useful for structured data extraction.
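Even with JSON mode enabled, it is good practice to parse and validate the output before using it. The raw string below stands in for a hypothetical model response:

```python
import json

# Hypothetical model output produced with JSON mode enabled.
raw = '{"name": "Ada Lovelace", "born": 1815, "field": "mathematics"}'

record = json.loads(raw)  # raises an error if the JSON is malformed
assert {"name", "born"} <= record.keys()  # check required fields exist
print(record["name"], record["born"])
```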

Knowledge Cutoff — The date after which a model has no training data. It will not know about events that occurred after this date.

L

Large Language Model (LLM) — An AI model with billions of parameters trained on vast amounts of text data. The technology behind ChatGPT, Claude, Gemini, and similar tools.

LoRA (Low-Rank Adaptation) — An efficient fine-tuning technique that modifies only a small subset of model parameters, reducing the cost and compute needed for customization.
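The savings come from simple arithmetic: fully updating a weight matrix of size d_out × d_in trains every entry, while LoRA trains two thin matrices totalling d_out·r + r·d_in parameters. With illustrative (not model-specific) sizes:

```python
# LoRA replaces a full update to a weight matrix W (d_out x d_in) with the
# product of two small matrices B (d_out x r) and A (r x d_in), where the
# rank r is much smaller than either dimension. Sizes below are illustrative.
d_out, d_in, r = 4096, 4096, 8

full_update_params = d_out * d_in      # train every entry of W
lora_params = d_out * r + r * d_in     # train only B and A

print(full_update_params)              # 16777216
print(lora_params)                     # 65536
print(full_update_params // lora_params)  # 256x fewer trainable parameters
```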

M

Mixture of Experts (MoE) — An architecture where only a subset of the model’s parameters is activated for each input. Mixtral uses this approach, achieving high capability at lower inference cost.

Multimodal — An AI model that can process multiple types of input (text, images, audio, video), not just text.

MMLU (Massive Multitask Language Understanding) — A benchmark testing general knowledge across 57 subjects. One of the most widely cited AI benchmarks.

N-O

Neural Network — The computational architecture underlying AI models, loosely inspired by biological neurons. Consists of layers of connected nodes that process information.

Open Source / Open Weight — Models whose weights (learned parameters) are publicly available for download and use. Strictly speaking, “open source” also implies open training data and code, which is why many such models are more accurately called “open weight.”

P

Parameters — The internal values a model learns during training. More parameters generally (though not always) mean greater capability. Frontier models such as GPT-4 and Claude are estimated to have hundreds of billions of parameters.

Pre-Training — The initial training phase where a model learns language by predicting the next token on massive text datasets.

Prompt — The input you give to an AI model. Everything from a simple question to a detailed instruction with context and examples.

Prompt Engineering — The skill of crafting effective prompts to get better results from AI models.

Q-R

Quantization — Reducing the precision of model parameters (e.g., from 16-bit to 4-bit) to reduce memory requirements and speed up inference, with a modest quality tradeoff.
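A minimal sketch of linear quantization, mapping floats onto 4-bit integers (0–15) and back. Real schemes (per-channel, group-wise) are more elaborate, but the round-trip error below shows the basic tradeoff:

```python
def quantize(values: list[float], bits: int = 4):
    # Map each float onto an integer grid of 2^bits evenly spaced levels.
    levels = 2**bits - 1
    lo, hi = min(values), max(values)
    scale = (hi - lo) / levels
    q = [round((v - lo) / scale) for v in values]
    return q, lo, scale

def dequantize(q: list[int], lo: float, scale: float) -> list[float]:
    return [lo + i * scale for i in q]

weights = [-0.42, -0.17, 0.03, 0.28, 0.91]
q, lo, scale = quantize(weights)
restored = dequantize(q, lo, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(q)          # small integers instead of floats
print(max_error)  # modest rounding error, bounded by half a step
```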

RAG (Retrieval-Augmented Generation) — A technique where the AI retrieves relevant documents from a knowledge base and includes them in the prompt, grounding responses in specific source material.
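A toy RAG pipeline, with keyword overlap standing in for a real embedding search. The documents and the scoring rule are invented for illustration; production systems retrieve by embedding similarity from a vector database:

```python
documents = [
    "The return policy allows refunds within 30 days of purchase.",
    "Shipping takes 3-5 business days within the continental US.",
    "Gift cards never expire and can be used on any product.",
]

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    # Score each document by how many query words it shares (a stand-in
    # for cosine similarity over embeddings), then keep the top k.
    q_words = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(q_words & set(d.lower().split())), reverse=True)
    return scored[:k]

query = "what is the return policy for refunds"
context = retrieve(query, documents)

# The retrieved text is placed in the prompt so the model answers from it.
prompt = f"Answer using only this context:\n{context[0]}\n\nQuestion: {query}"
print(prompt)
```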

RLHF (Reinforcement Learning from Human Feedback) — An alignment technique where human evaluators rate model responses, and the model is trained to produce responses that humans prefer.

S

Streaming — Delivering model output token by token as it is generated, rather than waiting for the complete response. Provides a better user experience for interactive applications.
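Streaming can be modeled as a generator that yields chunks as they become available. This sketch fakes a stream by splitting an already finished string; a real client would receive chunks over the network as the model generates them:

```python
def stream_response(text: str, chunk_size: int = 1):
    # Yield the text a few words at a time, mimicking token-by-token delivery.
    words = text.split()
    for i in range(0, len(words), chunk_size):
        yield " ".join(words[i : i + chunk_size]) + " "

received = []
for chunk in stream_response("Streaming shows partial output immediately"):
    received.append(chunk)  # a real UI would render each chunk as it arrives

print("".join(received).strip())
```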

System Prompt — A special prompt set at the beginning of a conversation that defines the model’s behavior, role, and constraints.

SWE-bench — A benchmark testing AI’s ability to resolve real GitHub issues. Measures practical software development skills.

T

Temperature — A parameter that controls the randomness of model output. Lower temperature (0.0) = more deterministic and focused. Higher temperature (1.0) = more creative and varied.
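Under the hood, temperature divides the model’s raw scores (logits) before they are turned into sampling probabilities. A sketch of that rescaling (a temperature of exactly 0 is typically special-cased by providers as “always pick the top token,” since dividing by zero is undefined):

```python
import math

def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    # Divide logits by the temperature, then normalize into probabilities.
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
cool = softmax_with_temperature(logits, 0.2)  # sharply peaked: near-deterministic
warm = softmax_with_temperature(logits, 1.0)  # flatter: more varied sampling

print([round(p, 3) for p in cool])
print([round(p, 3) for p in warm])
```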

Token — The basic unit of text that AI models process. In English, one token averages roughly three-quarters of a word, or about four characters. Both input and output are measured in tokens.
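A common rule of thumb is about four characters of English per token, which supports quick back-of-the-envelope estimates like the one below. Real tokenizers split text precisely and vary by model, so treat this as a rough guide only:

```python
def estimate_tokens(text: str) -> int:
    # Heuristic: ~4 characters of English per token. Always report at
    # least 1 token for non-empty inputs.
    return max(1, round(len(text) / 4))

sample = "AI models read and write text as tokens, not characters."
print(estimate_tokens(sample))
```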

Transformer — The neural network architecture underlying nearly all modern large language models. Introduced in the 2017 paper “Attention Is All You Need.”

U-V

Vector Database — A database optimized for storing and searching embeddings (numerical representations of text). Used in RAG systems.

W-Z

Weight — Another term for model parameters. “Open-weight” models allow you to download the trained parameters.

Zero-Shot — Asking a model to perform a task without providing any examples. Contrast with few-shot prompting, where examples are provided.

Key Takeaways

  • AI terminology can be confusing, but most concepts are straightforward once explained in plain language.
  • The most important terms for everyday AI users are: token, context window, hallucination, prompt engineering, and API.
  • For developers, understanding RAG, fine-tuning, embeddings, and streaming is essential.
  • Bookmark this glossary and return to it as you encounter new terms.

This content is for informational purposes only and reflects independently researched comparisons. AI model capabilities change frequently — verify current specs with providers. Not professional advice.