AI glossary 2026: the key terms explained, from tokens to hallucinations

TechCrunch · 1 h ago
An abstract network of connected nodes representing AI systems. Photo: Google DeepMind / Pexels

Artificial intelligence has developed a vocabulary that moves almost as fast as the technology itself, and the terms now appear routinely in business headlines, product launches and policy debates. Understanding a handful of core concepts makes that flood of news far easier to follow. This glossary walks through the most important ones in plain language, without assuming a technical background.

At the centre of the current wave is the large language model, or LLM. This is a type of AI system trained on enormous amounts of text to predict the most likely next piece of writing given what came before. That simple-sounding mechanism, scaled up massively, is what lets systems such as chatbots produce fluent answers, summaries and code. When people talk about generative AI, LLMs are usually what they mean, alongside models that generate images, audio or video.
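For readers who want to see the idea in miniature, the sketch below predicts the next word by counting which word most often followed the current one in a tiny sample text. It is a toy stand-in, not how an LLM is actually built: real models use neural networks trained on vastly more text, and the function names and sample sentence here are invented for illustration.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    # Count, for every word, which words followed it in the sample text.
    words = text.split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following

def predict_next(following, word):
    # Return the most frequent follower, or None if the word was never followed.
    if word not in following:
        return None
    return following[word].most_common(1)[0][0]

# Hypothetical sample text, chosen only for illustration.
model = train_bigrams("the cat sat on the mat and the cat slept")
prediction = predict_next(model, "the")
```

Here "the" was followed by "cat" twice and "mat" once, so the toy model predicts "cat".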

To process text, models break it into tokens, which are chunks of text roughly corresponding to words or parts of words. A model does not read letters or sentences the way people do; it works in tokens, and the number of tokens it can consider at once is called its context window. A larger context window lets a model take in more information, such as a long document, before responding.
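A rough sketch of both ideas follows. It assumes a deliberately simple rule in which every word or punctuation mark is one token; real tokenizers use learned sub-word schemes, so their counts differ. The function names and the window size of 4 are illustrative assumptions.

```python
import re

def toy_tokenize(text):
    # Toy rule (an assumption): each word or punctuation mark is one token.
    return re.findall(r"\w+|[^\w\s]", text)

def fit_to_context(tokens, context_window):
    # Keep only the most recent tokens that fit inside the window.
    if context_window <= 0:
        return []
    return tokens[-context_window:]

tokens = toy_tokenize("Models read tokens, not letters.")
visible = fit_to_context(tokens, context_window=4)
```

The sentence becomes seven tokens, and with a window of four the model would only "see" the last four of them. Dropping the oldest tokens is one simple policy among several.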

Training is the process of building a model by exposing it to data and adjusting its internal settings, called parameters, until it performs well. Parameters are the numerical values the model tunes during training, and their count, often in the billions, is a rough proxy for a model's size and capacity. After the initial training, models are frequently refined through fine-tuning, additional training on more specific data to shape their behaviour for a particular task.

One term that has entered everyday use is hallucination. This describes when an AI system produces information that sounds plausible but is false or fabricated. Hallucinations are a fundamental challenge because the model is generating likely-sounding text rather than retrieving verified facts, so it can state incorrect things with the same confident tone as correct ones. Reducing hallucinations is a major focus of current research.

A technique often used to make models more reliable is retrieval-augmented generation, or RAG. Instead of relying solely on what a model learned during training, RAG lets the system look up relevant information from an external source, such as a document database, and use it to ground its answer. This helps keep responses accurate and up to date, since the model can draw on current material rather than only its training data.
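The look-up-then-answer pattern can be sketched in a few lines. This version assumes a crude relevance score (the number of words a document shares with the question) and stops at building the prompt; production systems typically use more sophisticated search and then pass the prompt to a model. The documents, question and function names are hypothetical.

```python
def retrieve(question, documents, k=1):
    # Score each document by how many words it shares with the question.
    q_words = set(question.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(q_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(question, documents):
    # Place the retrieved text ahead of the question so the answer can rely on it.
    context = "\n".join(retrieve(question, documents))
    return f"Use only this context:\n{context}\n\nQuestion: {question}"

# Hypothetical document store and question.
documents = [
    "The office opens at 9 am on weekdays.",
    "The cafeteria serves lunch from noon.",
]
question = "When does the office open?"
prompt = build_prompt(question, documents)
```

The prompt that reaches the model now contains the office-hours sentence, so the answer can be drawn from that source rather than from training data alone.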

Inference is the term for actually running a trained model to get an output, as opposed to training it in the first place. Every time you ask a chatbot a question, that is inference. It matters commercially because inference consumes computing power each time a model is used, and at large scale those costs add up, which is why efficient inference has become a competitive and financial priority for AI companies.
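A back-of-the-envelope calculation shows how those costs accumulate. Every figure below is a made-up assumption for illustration, not a real price or usage number from any provider.

```python
def daily_inference_cost(requests_per_day, tokens_per_request, price_per_million_tokens):
    # Total tokens processed in a day, priced per million tokens.
    total_tokens = requests_per_day * tokens_per_request
    return total_tokens / 1_000_000 * price_per_million_tokens

# Hypothetical figures: one million requests a day, 500 tokens each,
# at an assumed price of 2.00 currency units per million tokens.
cost = daily_inference_cost(1_000_000, 500, 2.0)
```

Under these assumed figures the bill comes to 1,000 currency units a day, and it grows in direct proportion to usage.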

One of the fastest-rising terms is the AI agent. An agent is a system that does not just answer a single prompt but can take a series of actions to accomplish a goal, such as using tools, browsing information or executing steps in sequence. Agents represent a shift from AI as a question-answering tool toward AI that can carry out multi-step tasks, though how capable and reliable they are in practice remains a live debate.
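The multi-step pattern can be illustrated with a toy loop in which each tool's output feeds the next step. The plan here is fixed in advance, which is a simplifying assumption: in a real agent a model would decide each step as it goes. The tool names and their behaviour are invented for this sketch.

```python
def run_agent(goal, plan, tools):
    # The plan is scripted here; a real agent would let a model choose each step.
    result = goal
    steps_taken = []
    for tool_name in plan:
        result = tools[tool_name](result)  # each tool's output feeds the next step
        steps_taken.append(tool_name)
    return result, steps_taken

# Hypothetical tools that only return descriptive strings.
tools = {
    "search": lambda query: f"three articles about {query}",
    "summarise": lambda text: f"summary of {text}",
}
final, steps = run_agent("AI agents", ["search", "summarise"], tools)
```

The loop turns the goal into a search result and then into a summary, recording each tool it used along the way.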

Several other terms recur often. Multimodal describes models that handle more than one type of input or output, such as text and images together. Open-weight models are those whose trained parameters are released publicly, allowing others to run and adapt them, in contrast to closed models accessed only through a company's service. The label open-source is often used loosely for the same thing, although strictly it implies that the code and licence terms are open as well. Prompt engineering refers to crafting the input given to a model to get better results.

Knowing these terms does not require understanding the mathematics beneath them, and the vocabulary will keep evolving as the field does. But a working grasp of LLMs, tokens, training, hallucinations, inference and agents covers most of what appears in day-to-day coverage, turning otherwise opaque announcements into something a general reader can follow and assess.

This article is an AI-curated summary based on TechCrunch. The illustration is a stock photo by Google DeepMind from Pexels.
