AI Glossary

A mathematical function applied to a neuron's output that introduces non-linearity, enabling neural networks to learn complex patterns.

AdaBoost (Boosting)

The original boosting algorithm that trains a sequence of weak classifiers, reweighting misclassified examples after each round so subsequent models focus on the hardest cases.

Adam Optimizer

An adaptive learning rate optimization algorithm that maintains per-parameter learning rates based on first and second moment estimates of gradients.

Adapter Layers

A PEFT method that inserts small trainable neural modules into a frozen pretrained model, allowing task-specific adaptation without updating the full network.

ADK

Google’s Agent Development Kit, a framework for building, orchestrating, evaluating, and deploying tool-using and multi-agent systems.

Agent Harness

The runtime scaffolding around an AI model that turns it into an agent by managing tools, state, permissions, context, execution loops, and logs.

Agent Orchestration

The coordination layer that manages how multiple AI agents work together, routing tasks between them, handling dependencies, and combining their outputs into a coherent result.

Agentic AI

AI systems that can autonomously plan, make decisions, and take actions across multiple steps to accomplish complex goals with minimal human intervention.

Agentic Workflows

Techniques

Multi-step AI processes where a model autonomously plans, executes, and adapts a sequence of actions - calling tools, making decisions, and handling errors - to complete a complex task without human intervention at each step.

AGI

Artificial General Intelligence: a hypothetical AI system able to learn, reason, adapt, and perform economically or cognitively valuable tasks across domains at roughly human-level or beyond.

AI Agent

An LLM-based system that can autonomously plan multi-step tasks, use external tools, and take actions in the real world to achieve specified goals.

AI Code Review

The use of AI models to automatically analyze code changes in pull requests, identifying bugs, security vulnerabilities, style violations, and logic errors before human reviewers see them.

AI Memory

The ability of an AI assistant to retain information about a user across conversations, including preferences, context, and past interactions, enabling more personalized responses over time.

AIME

The American Invitational Mathematics Examination - a prestigious high school math competition whose problems are used as a benchmark for evaluating AI mathematical reasoning capabilities.

Algorithm

A finite sequence of well-defined instructions for solving a problem or performing a computation.

Anthropic

AI safety company and creator of the Claude family of large language models, founded by former OpenAI researchers.

API

Application Programming Interface - a defined set of rules and protocols that allows different software systems to communicate with each other.

Approximate Nearest Neighbor

An algorithm that finds points approximately closest to a query in high-dimensional space, trading small accuracy loss for dramatically faster search over large datasets.

Artificial Analysis

An independent platform that benchmarks AI models and inference providers across intelligence, performance, price, speed, and latency with standardized methodology.

Artificial Intelligence

The field of computer science focused on building systems that can perform tasks typically requiring human intelligence, including reasoning, learning, perception, and decision-making.

Artificial Relevance

When an AI system returns results that appear semantically related but fail to match the user's actual intent, common in vector search where mathematical similarity scores high but genuine usefulness scores low.

Attention Mechanism

A technique that allows neural networks to focus on relevant parts of the input when producing each element of the output.

Autonomous AI

AI systems that operate and make decisions independently without requiring human approval for each action, raising significant questions about oversight, safety, and accountability.

Backpropagation

An algorithm for computing gradients of the loss function with respect to each weight in a neural network by applying the chain rule layer by layer.

Batch Normalization

A technique that normalizes the inputs of each layer in a neural network across the current mini-batch, stabilizing and accelerating training.

Beam Search

A decoding algorithm that generates text by keeping track of the top K most probable partial sequences at each step, balancing output quality against the cost of exploring all possible completions.

Benchmark

A standardized test or evaluation used to measure and compare the performance of AI models on specific tasks like reasoning, coding, math, or language understanding.

BERT

A transformer-based language model developed by Google that learns bidirectional representations of text, meaning it considers both left and right context simultaneously to understand word meaning.

Bi-Encoder

A model architecture that independently encodes queries and documents into separate embeddings for fast similarity comparison, used for initial retrieval at scale.

Bias-Variance Tradeoff

The fundamental tension in machine learning between a model being too simple to capture patterns (high bias) and too complex, fitting noise instead of signal (high variance).

Bidirectional Influence

The mutual shaping that occurs between an AI model and a user over extended interaction, where each adapts to the other's patterns until the line between influencer and influenced becomes unclear.

Binary

A compiled, executable file that a computer can run directly, as opposed to source code that must be interpreted or compiled first.

BLEU Score

Evaluation

A metric that measures how closely a machine-generated text matches one or more human reference translations, calculated by comparing overlapping word sequences (n-grams) between the output and reference.

BM25

A ranking function used in information retrieval that estimates document relevance based on term frequency with diminishing returns and document length normalization.

Boosting (Ensemble Methods)

A family of ensemble learning techniques that combine many weak learners sequentially, where each new model focuses on correcting the mistakes of the previous ones.

Byte Pair Encoding

A subword tokenization algorithm that builds a vocabulary by iteratively merging the most frequent adjacent byte or character pairs in a corpus, used by most modern language models to split text into tokens.

C2PA

AI Safety & Ethics

The open standard behind Content Credentials, attaching signed provenance metadata that shows where digital media came from and how it changed.

Chain of Thought

A prompting and reasoning technique where an AI model generates intermediate steps before arriving at a final answer, improving accuracy on complex tasks.

Chatbot

A software application that uses AI to simulate human-like conversation through text or voice, ranging from rule-based scripts to modern LLM-powered assistants.

ChatGPT

OpenAI's conversational AI product that provides a chat interface to GPT models, widely credited with bringing large language models to mainstream public awareness.

Chinchilla Scaling Law

A 2022 DeepMind finding that optimal language model training requires scaling training tokens proportionally with model parameters - roughly 20 tokens per parameter - overturning the prior assumption that larger models always outperform smaller ones given the same compute budget.

Chunking

The process of dividing large documents into smaller, semantically coherent pieces suitable for embedding and retrieval in RAG systems.

Citation

The practice of attributing specific claims in an LLM-generated answer to their source documents, enabling verification and building trust.

Claude Code

Anthropic's agentic coding tool that runs in the terminal, capable of reading, writing, and executing code across entire codebases with human oversight.

Claude Code Review

An AI-powered code review tool by Anthropic that dispatches multiple agents in parallel to analyze pull requests for bugs, security issues, and logic errors, with a verification layer to filter false positives before surfacing findings.

Claude Fable 5

Anthropic's public Mythos-class model for long, complex work, with hard safety fallbacks for high-risk domains and premium frontier pricing.

Claude Haiku 4.5

Anthropic's fastest model released in October 2025, achieving 90% of Sonnet 4.5's performance on agentic coding at lower cost.

Claude Opus 4.5

Anthropic's November 2025 flagship model achieving 80.9% on SWE-bench with 50-75% reduction in tool calling errors.

Claude Opus 4.6

Anthropic's February 2026 flagship with 1M context window, 80.8% on SWE-bench, 68.8% on ARC-AGI-2, and the highest Terminal-Bench 2.0 score among all frontier models.

Claude Opus 4.7

Anthropic’s premium generally available frontier model for difficult coding, tool-using, and long-running agentic work.

Claude Opus 4.8

Anthropic's May 2026 Opus upgrade for long-horizon agentic coding, dynamic workflows, effort control, and faster premium inference.

Claude Sonnet 4.5

Anthropic's September 2025 model marketed as the best coding model and best for agents, achieving 77.2% on SWE-bench and 100% on AIME with Python.

Claude Sonnet 4.6

Anthropic's February 2026 mid-tier model achieving 79.6% on SWE-bench and 72.5% on OSWorld, matching near-flagship performance at $3/$15 per million tokens.

Claude Sonnet 5

Anthropic's June 2026 agentic Sonnet model, combining a 1M-token context window, near-Opus performance, and lower production pricing.

Clustering

An unsupervised learning technique that groups data points into clusters based on similarity, without predefined labels, revealing natural structure in data.

Code Reviewer

A person or tool responsible for examining code changes before they are merged, checking for bugs, security issues, style violations, and whether the implementation matches the intended design.

CodeRabbit

An always-on AI code review platform that automatically analyzes every pull request using 40+ integrated linters, security analyzers, and context-aware AI to provide inline feedback without human prompting.

Codex

OpenAI's asynchronous coding agent that runs tasks in cloud sandboxes, designed for parallel software engineering work like writing features, fixing bugs, and running tests.

Cognitive Computing

A category of AI systems designed to simulate human thought processes by combining machine learning, natural language processing, and reasoning.

Cold Start

The initial delay when a system or service must initialize from scratch before it can handle requests, common in serverless and containerized deployments.

Computer Vision

A field of AI that enables machines to interpret and extract meaningful information from images, video, and other visual inputs.

Context Fusion

The process of combining structured knowledge from a knowledge graph with unstructured text from RAG retrieval into a unified context for LLM generation.

Context Rot

The phenomenon where large language model performance degrades as the input context grows longer, with the model becoming less accurate at retrieving and reasoning over information in large prompts.

Context Window

The maximum number of tokens a language model can process at once, which limits how much retrieved content can be included alongside a query.

Convolutional Embedding

Computer Vision

A dense vector representation of input data produced by passing it through a convolutional neural network, capturing learned spatial features in a fixed-size numeric form suitable for downstream tasks like similarity search, classification, or retrieval.

Convolutional Neural Network

Computer Vision

A neural network architecture that uses convolutional layers to automatically learn spatial hierarchies of features, primarily used for image and video analysis.

Coreference Resolution

The task of identifying all expressions in a text that refer to the same real-world entity and grouping them into coreference clusters.

Cosine Similarity

A measure of similarity between two vectors based on the cosine of the angle between them, ranging from -1 (opposite) to 1 (identical direction), widely used to compare text embeddings.

Cross-Encoder

A model architecture that jointly encodes a query-document pair to compute a relevance score, offering higher accuracy than bi-encoders but at greater computational cost.

Cross-Entropy

A loss function that measures the difference between a model's predicted probability distribution and the true distribution, widely used for classification tasks.

CUDA

NVIDIA's parallel computing platform and API that allows developers to use NVIDIA GPUs for general-purpose processing, forming the backbone of most AI training and inference workflows.

Cursor

An AI-powered code editor built on VS Code that integrates large language models directly into the editing experience for code generation, editing, and codebase-wide understanding.

Cypher

A declarative graph query language created for Neo4j that uses ASCII-art syntax to represent and match graph patterns.

DALL-E

OpenAI's family of text-to-image models that generate images from natural language descriptions. DALL-E 3 improved prompt adherence and text rendering, but it is now a previous-generation model in OpenAI's API docs.

Data Augmentation

A regularization technique that artificially expands the training dataset by applying label-preserving transformations to existing examples, forcing the model to learn invariances.

Data Labeling

The process of annotating raw data with meaningful tags or labels so that machine learning models can learn from it during supervised training.

Decision Tree

A supervised learning algorithm that splits data into branches based on feature values, forming a tree-like structure of if-then decisions that leads to a prediction at each leaf.

Deep Learning

A subset of machine learning that uses neural networks with many layers to learn complex patterns and representations from large amounts of data.

DeepSeek R1

Open-weight reasoning model released in January 2025, achieving 97.3% on MATH-500 and proving frontier AI doesn't require massive budgets.

DeepSeek V3

Open-weight MoE model with updated version V3-0324 scoring 81.2% on MMLU-Pro and ranking 5th on LMArena leaderboard.

DeepSeek V4

DeepSeek V4 is DeepSeek's open-weight MoE model family with Pro and Flash variants, 1M-token context, DSA attention, and strong coding, reasoning, and agentic benchmarks.

Dense Retrieval

A neural retrieval method that encodes queries and documents as dense vector embeddings and retrieves documents based on vector similarity.

Dependency Parsing

A syntactic analysis task that identifies grammatical relationships between words in a sentence, producing a directed dependency tree.

Docker Image

A lightweight, standalone, executable package that includes everything needed to run a piece of software - code, runtime, libraries, and system tools.

Doubao 1.5 Pro

ByteDance's reasoning model with Deep Thinking mode, matching GPT-4o performance at 50x lower cost with 256K context window.

Dropout

A regularization technique that randomly sets a fraction of neuron activations to zero during each training step, preventing co-adaptation and reducing overfitting.

Edge AI

Infrastructure

Running artificial intelligence models directly on local devices like phones, cameras, or sensors rather than sending data to the cloud for processing.

ElevenLabs

An AI voice synthesis platform that converts text to speech and clones voices using neural audio models. Its Flash v2.5 model achieves approximately 75ms latency, making it the dominant infrastructure layer for real-time voice agents.

Embedding

A learned dense vector representation that maps discrete entities like words or items into a continuous vector space where similar items are closer together.

Emergent Behavior

Capabilities or patterns that arise in large AI models without being explicitly trained for, often appearing only above certain scale thresholds.

Entity Linking

The task of resolving different textual mentions of an entity to a single canonical representation, critical for knowledge graph quality.

Epoch

One complete pass through the entire training dataset during model training.

ERNIE 4.5

Baidu's open-source multimodal AI model processing text, images, audio, and video, with benchmark wins over GPT-4o and GPT-5 on specific tasks.

Exploding Gradients

A training problem where gradients grow exponentially large as they propagate backward through many layers, causing weight updates to be enormous and training to diverge.

FAISS

Facebook AI Similarity Search , an open-source library by Meta for efficient similarity search and clustering of dense vectors, optimized for billion-scale datasets.

Feature Engineering

The process of transforming raw data into informative input features that make patterns more accessible to machine learning models.

Feature Extraction

The process of transforming raw input data into a set of informative, non-redundant numeric representations that capture the properties most useful for a downstream machine learning task.

Federated Learning

Training Techniques

A machine learning approach where a model is trained across multiple decentralized devices or servers holding local data, without ever exchanging the raw data itself - only model updates are shared.

Feed-Forward Network

A simple neural network layer within each transformer block that independently transforms each token's representation through two linear transformations with a non-linear activation in between.

Fine-tuning

The process of further training a pretrained model on a smaller, task-specific dataset to adapt it for a particular use case.

Foundation Model

A large AI model trained on broad data at scale that can be adapted to a wide range of downstream tasks, serving as the base layer for many AI applications.

Gated Recurrent Unit

A recurrent neural network architecture introduced in 2014 that uses two gates - a reset gate and an update gate - to control information flow across timesteps, offering similar sequential modeling capability to LSTM with fewer parameters.

Gemini

Google DeepMind's family of multimodal AI models that power Google's AI products across Search, Workspace, Android, and developer APIs.

Gemini 2.5 Flash

Google DeepMind's fast model released in May 2025 with 1M context window and 251 tokens/second output speed.

Gemini 2.5 Pro

Google DeepMind's 2025 flagship with 1M token context window, leading Humanity's Last Exam with 18.8% accuracy.

Gemini 3.1 Pro

Google DeepMind's February 2026 model topping 13 of 16 industry benchmarks with 77.1% on ARC-AGI-2 and 94.3% on GPQA Diamond.

Gemini Spark

Google's 24/7 personal AI agent in the Gemini app, designed to work proactively across connected Google apps and tools under the user's direction.

Generative Adversarial Network

A framework consisting of two neural networks - a generator and a discriminator - that compete against each other to produce increasingly realistic synthetic data.

GitHub Copilot

An AI-powered code completion tool developed by GitHub and OpenAI that suggests code inline as you type, integrated into popular editors like VS Code, JetBrains, and Neovim.

GLM 5.2

Z.ai's June 2026 open-weight GLM model for long-horizon agentic coding, 1M-token context, IndexShare attention, and repository-scale engineering work.

GLM-4.5

Zhipu AI's open-weight agentic model with 355B total parameters, ranking 3rd globally and excelling at tool use with 90.6% accuracy.

GLM-5

Z.ai's 744B-parameter open-weight MoE model for agentic engineering, coding, reasoning, and long-horizon tool-use tasks.

Google DeepMind

Google's AI research lab formed by merging DeepMind and Google Brain, responsible for AlphaGo, AlphaFold, and the Gemini model family.

GPQA Diamond

A benchmark of 198 graduate-level multiple-choice questions in physics, biology, and chemistry that are designed to be unsolvable through internet search, requiring genuine PhD-level expertise.

GPT Image 2

OpenAI's state-of-the-art image generation and editing model behind ChatGPT Images 2.0, designed for higher-quality visual outputs, image inputs, and production-ready text rendering.

GPT-4.1

OpenAI's April 2025 API-focused model with a massive 1M token context window and 38.3% on MultiChallenge, beating GPT-4o by 10.5%.

GPT-4o

OpenAI's fast, cost-effective multimodal flagship model released in May 2024, supporting text, image, and audio with a 128K context window.

GPT-5

OpenAI's major generational leap released in August 2025, achieving 94.6% on AIME 2025 and 45% fewer factual errors than GPT-4o.

GPT-5.2

OpenAI's December 2025 model with a 256K context window, 100% AIME 2025 accuracy, and hallucination rate reduced to 6.2%.

GPT-5.4

OpenAI's flagship model released March 2026, available in standard, Thinking, and Pro variants with a 1 million token context window, native computer use, Tool Search, and significant improvements in reasoning, coding, and factual accuracy.

GPT-5.5

An OpenAI frontier model designed for complex real-world work, with strong benchmark performance in agentic coding, computer use, and professional knowledge tasks.

GPT-5.6

OpenAI's July 2026 frontier model family, comprising the flagship Sol, balanced Terra, and fast, low-cost Luna models.

GPT-oss-120b

OpenAI's first major open-weight model with 117B parameters and MoE architecture, rivaling proprietary o4-mini performance.

GPU

A Graphics Processing Unit - a specialized processor designed for parallel computation, now essential for training and running AI models due to its ability to perform thousands of operations simultaneously.

Gradient Clipping

A technique that caps gradient magnitudes during training to prevent exploding gradients from destabilizing the optimization process.

Gradient Descent

An optimization algorithm that iteratively adjusts model parameters in the direction that minimizes the loss function.

Graph Embedding

A technique for representing graph nodes as dense vectors that preserve graph structure, enabling similarity search and machine learning over graph data.

Graph Traversal

The process of systematically visiting nodes in a graph by following edges, used in knowledge graphs to explore relationships and answer multi-hop queries.

GraphRAG

An architecture pattern that incorporates knowledge graph reasoning alongside vector-based retrieval in RAG systems, pioneered by Microsoft for enterprise search.

Grok 3

xAI's June 2025 model with 1M context window, beating GPT-4o and Claude 3.5 Sonnet on AIME and GPQA with 1402 Arena Elo.

Grok 4

xAI's July 2025 model achieving 100% on AIME 2025 and 61.9% on USAMO 2025, with 4-agent parallel collaboration in latest beta.

Grok 4.5

SpaceXAI's July 2026 coding and agentic model, offering near-Opus performance with aggressive pricing and unusually low token usage.

Grounding

The technique of anchoring LLM responses in factual, retrieved information rather than the model's parametric knowledge, reducing hallucinations.

Guardrails

Safety

Safety constraints and validation layers applied to AI systems to prevent harmful, off-topic, or policy-violating outputs.

Guardrails AI

An open-source framework and tooling ecosystem for adding input/output validation, risk checks, and structured output controls to LLM applications.

Hallucination

When an AI model generates plausible-sounding but factually incorrect or fabricated information with apparent confidence.

Higgsfield

An AI video and image creation platform for generating cinematic clips, ads, product visuals, and social content using multiple leading video models in one workspace.

HNSW

Hierarchical Navigable Small World , an efficient graph-based algorithm for approximate nearest neighbor search that builds a multi-layer navigation structure over vectors.

Hugging Face

The largest open-source AI platform and model hub, hosting over 2 million models, 500,000 datasets, and 1 million demo apps used by 10 million developers.

Hybrid Search

A retrieval approach that combines different search methods, typically keyword-based (BM25) and semantic (dense embedding) search, to leverage the strengths of both.

Hyperparameter

A configuration value set before training begins that controls the learning process itself, as opposed to model parameters which are learned from data.

IA3

Infused Adapter by Inhibiting and Amplifying Inner Activations, a PEFT method that learns small scaling vectors to modulate a frozen model's internal activations.

Inference

The process of using a trained model to make predictions on new, unseen data, as opposed to the training phase where the model learns from labeled examples.

Inference-Time Compute

Additional computational effort spent during a model's response generation, such as chain-of-thought reasoning or search, to improve output quality at the cost of speed and resources.

Kaggle

Google's data science and machine learning platform for competitions, datasets, notebooks, models, courses, and community collaboration.

Kimi K2

Moonshot AI's open-source 1 trillion parameter MoE model with 32B active parameters, outperforming GPT-5 and Claude Sonnet 4.5 on reasoning benchmarks.

Kimi K2.5

Moonshot AI's January 2026 open-weight multimodal model with vision and agent swarm capabilities, leading on agentic and coding benchmarks.

KL Divergence

A measure of how one probability distribution differs from a reference distribution, quantifying the information lost when approximating one distribution with another.

KNN

K-Nearest Neighbors - a simple algorithm that classifies a data point based on the majority class of its K closest neighbors in the feature space.

Knowledge Graph

A structured representation of knowledge as entities (nodes) and relationships (edges), often with properties attached to both, enabling logical traversal and multi-hop reasoning over data.

KV Cache

A memory optimization technique in LLM inference that stores previously computed key-value pairs from attention layers, avoiding redundant recalculation when generating each new token.

LangChain

A widely used framework for building LLM applications and agents, with abstractions for models, tools, memory, retrieval, and orchestration.

Language Modeling

The task of learning a probability distribution over sequences of tokens, enabling a model to predict or generate text.

Large Language Model

A neural network trained on vast amounts of text data that can understand and generate human language with remarkable fluency and versatility.

Latent Space

A lower-dimensional representation space learned by a model where similar inputs are mapped to nearby points, capturing the essential structure of the data.

Layer Normalization

A technique that normalizes the inputs across the features of a single training example during a forward pass, stabilizing training and reducing sensitivity to learning rate choice.

Learning Rate

A hyperparameter that controls how much model weights are adjusted in response to the estimated error during each step of gradient descent optimization.

Lemmatization

The process of reducing a word to its dictionary base form (lemma) using vocabulary and morphological analysis.

LightGBM (Boosting)

A gradient boosting framework by Microsoft that uses leaf-wise tree growth and histogram-based splitting for significantly faster training on large datasets while maintaining competitive accuracy.

Llama 3.1

Meta's open-weight large language model family released in July 2024, available in 8B, 70B, and 405B parameter sizes with a 128k token context window.

Llama 4 Maverick

Meta's April 2025 open-weight flagship with 402B total parameters, 1M context window, and multimodal capabilities beating GPT-4o.

Llama 4 Scout

Meta's April 2025 open-weight model with 109B total parameters and industry-leading 10M token context window.

LMArena

A crowdsourced platform where users compare AI models head-to-head in blind conversations, producing Elo-based rankings that reflect real human preferences.

Load Balancing

In Mixture of Experts models, the set of techniques that ensure tokens are distributed evenly across experts during training and inference to prevent expert collapse and maximize model capacity.

Logits

Raw, unnormalized scores produced by a model before applying softmax or sigmoid to turn them into probabilities.

Long Short-Term Memory

A recurrent neural network architecture that uses input, forget, and output gates to selectively retain or discard information across long sequences, solving the vanishing gradient problem that made earlier RNNs fail on long-range dependencies.

LoRA

Low-Rank Adaptation, a PEFT method that fine-tunes large models by learning small low-rank update matrices instead of modifying the full weight matrices.

Loss Function

A function that measures the difference between a model's predictions and the actual target values, guiding the optimization process during training.

Lovable

Companies & Platforms

An AI app-building platform that lets people create full-stack web applications by describing what they want in natural language, then iterating on generated code.

Machine Learning

A field of artificial intelligence that enables systems to learn and improve from experience without being explicitly programmed.

Mamba-3

A state space model architecture published at ICLR 2026 that advances Mamba-2 through trapezoidal discretization, a MIMO formulation for hardware efficiency, and complex-valued dynamics via data-dependent RoPE - delivering stronger state tracking at half the state size of its predecessor.

Maximum A Posteriori

A method of estimating model parameters that finds the single most probable value given the observed data and a prior belief, balancing evidence from data with prior assumptions.

MCMC

Markov Chain Monte Carlo - a class of algorithms that sample from complex probability distributions by constructing a Markov chain whose stationary distribution matches the target.

MCP

Model Context Protocol - an open standard created by Anthropic that defines how AI assistants connect to external data sources, tools, and services through a unified interface.

Merge Queue

Software Engineering

A system that batches and sequences pull requests so they can be tested and merged safely at higher throughput.

Meta AI

Meta's AI division responsible for the open-source Llama model family, PyTorch, and FAIR research lab.

Midjourney

An AI image and video generation platform best known for highly stylized, cinematic text-to-image outputs and a creator workflow built around prompts, references, and variations.

MiniMax M2.5

MiniMax's February 2026 model scoring 80.2% on SWE-Bench Verified, outperforming Claude Opus 4.6 and GPT-5.2 at 1/20th the cost.

MiniMax M2.7

A large language model released in March 2026 by Chinese AI lab MiniMax that participated in its own training loop, achieving competitive performance on software engineering and research benchmarks compared to GPT-5.3-Codex and Opus 4.6.

MiniMax M3

MiniMax's June 2026 frontier open-weight model claim combining coding strength, 1M context, native multimodality, and low-cost agentic API access.

Mistral Medium 3

European AI model achieving 90% of Claude Sonnet 3.7 capabilities while demonstrating cost-efficient alternative to premium models.

Mixture of Experts

A model architecture that splits computation across multiple specialized sub-networks (experts), activating only a subset for each input to achieve large model capacity at a fraction of the compute cost.

Mixture of Experts Routing

The mechanism within a Mixture of Experts model that determines which expert sub-networks process each input token, directly controlling how model capacity is utilized.

MMMU-Pro

A rigorous multimodal AI benchmark with college-level questions across six disciplines that tests whether models truly understand visual and textual information together.

Model Collapse

Concepts

A degenerative process where AI models trained on synthetic data generated by other AI models progressively lose diversity and accuracy, converging on a narrow, distorted version of the original data distribution.

Model Distillation

A technique where a smaller 'student' model is trained to replicate the behavior of a larger 'teacher' model, preserving much of the performance at a fraction of the size and cost.

MoltBook

A social network exclusively for AI agents, where autonomous bots interact, post content, and form communities.

MoltBot

The intermediate name for the OpenClaw AI agent framework during its transition from ClawdBot.

Multi-Head Attention

An extension of attention that runs multiple attention operations in parallel with different learned projections, allowing the model to capture different types of relationships simultaneously.

Multi-Hop Reasoning

Answering questions that require connecting multiple pieces of information across several reasoning steps, a key strength of knowledge graph-augmented systems.

n8n

Tools

An open source workflow automation platform that lets users connect apps, APIs, and AI models through a visual node-based interface, with the option to self-host for full data control.

Named Entity Recognition

An NLP task that identifies and classifies named entities such as people, organizations, locations, and dates in unstructured text.

Natural Language Processing

A branch of AI focused on enabling computers to understand, interpret, generate, and interact with human language in useful ways.

Nemotron

NVIDIA's family of open large language models designed for agentic AI workflows, reasoning, and enterprise deployment, built on open-weight base models like Llama and fine-tuned with NVIDIA's training infrastructure.

Nemotron 3 Super

NVIDIA's March 2026 open-weight reasoning model with 120 billion total parameters and 12 billion active, combining Mamba and transformer layers in a hybrid MoE architecture with a 1 million token context window.

Neo4j

The most widely used graph database in industry, designed for storing and querying property graphs using the Cypher query language.

Neural Architecture Search

A set of methods for automatically discovering neural network architectures instead of designing them entirely by hand.

Neural Network

A computing system inspired by biological neural networks that learns to perform tasks by considering examples without being explicitly programmed.

NVIDIA

Companies

The semiconductor company whose GPUs became the dominant hardware platform for training and running AI models, making it one of the most valuable companies in the world.

NVIDIA OpenShell

NVIDIA's open-source runtime for running autonomous AI agents inside sandboxed environments with policy controls, inference routing, and kernel-level isolation.

Ollama

An open-source tool for running large language models locally on personal computers with a simple command-line interface.

Ontology

A formal specification of concepts, categories, and relationships within a domain that defines what types of entities exist and how they can relate to each other.

Open Source

Software whose source code is publicly available for anyone to view, modify, and distribute, enabling community-driven development and transparency.

Open Weight Model

An AI model whose trained parameters (weights) are publicly released for download and use, but whose training data, code, or methodology may remain proprietary.

OpenAI

American AI research company and creator of ChatGPT, GPT-series models, DALL-E, and Whisper.

OpenAI o3

OpenAI's reasoning model released in April 2025 with a 200K context window, achieving 88.9% on AIME 2025 and 69.1% on SWE-bench Verified.

OpenAI o4-mini

OpenAI's smaller reasoning model released in April 2025, achieving 92.7% on AIME 2025 and 99.5% with Python interpreter access.

OpenClaw (ClawdBot)

Open-source autonomous AI agent framework originally called ClawdBot, capable of executing real-world tasks via LLMs.

OpenCode

An open-source AI coding agent from Anomaly that runs in the terminal, desktop app, and IDE extensions, with support for many model providers and MCP tools.

Orchestration Layer

The application logic that sits on top of AI models to coordinate their use - managing prompts, tools, memory, and workflows to build complete AI-powered systems.

Overfitting

A phenomenon where a model learns the training data too well, including its noise and outliers, resulting in poor performance on unseen data.

Passive Code Reviewer

A code analysis tool that monitors your work automatically without requiring any action from you, detecting when you go idle and surfacing findings when you return.

Pattern Matching

A technique for checking data against a set of predefined patterns or rules, used in programming languages, text processing, and machine learning.

PEFT

Parameter-Efficient Fine-Tuning, a family of methods that adapts large pretrained models by training only a small subset of new or selected parameters instead of updating the full model.

Perceptron

The simplest neural network unit that computes a weighted sum of inputs, adds a bias, and passes the result through an activation function to produce an output.

Perplexity

An AI company that builds search and productivity tools by orchestrating multiple foundation models, known for its answer engine and the multi-agent Perplexity Computer platform.

Perplexity Computer

Companies & Platforms

A multi-model agentic AI system by Perplexity that orchestrates 19 AI models to autonomously execute complex, long-running workflows - from research to code to media generation - in isolated compute environments.

POS Tagging

The task of assigning a part-of-speech label - noun, verb, adjective, etc. - to each token in a sentence.

Positional Encoding

A technique that injects information about token position into transformer inputs, since the attention mechanism itself is permutation-invariant and has no inherent notion of sequence order.

Prefix Tuning

A PEFT method that learns small trainable prefix vectors prepended to a model's internal attention states, steering generation without updating the full model.

Prompt Engineering

The practice of carefully crafting input text to elicit desired behavior from large language models, including techniques like few-shot examples, chain-of-thought reasoning, and system instructions.

Prompt Injection

An attack where malicious instructions are hidden inside input data to hijack an AI model's behavior, causing it to ignore its original instructions and follow the attacker's instead.

Prompt Tuning

A PEFT method that learns a small set of trainable virtual prompt embeddings to steer a frozen model toward a task or behavior.

PyTorch

An open-source machine learning framework developed by Meta AI, known for its Pythonic design, dynamic computation graphs, and dominance in AI research - the framework behind most frontier model development today.

Quantization

A technique that reduces the numerical precision of a model's weights and activations, shrinking memory usage and speeding up inference with minimal loss in accuracy.

Query Routing

The process of classifying a user query and directing it to the most appropriate retrieval strategy, such as knowledge graph lookup, RAG search, or hybrid retrieval.

Qwen 3

Alibaba's April 2025 open-source model family trained on 36 trillion tokens in 119 languages, competitive with DeepSeek R1 and o3-mini.

Qwen 3.5

Alibaba's February 2026 open-source model family using sparse MoE with 397B total parameters (17B active), supporting 201 languages and a 262K context window while claiming to outperform GPT-5.2 and Claude Opus 4.5 on 80% of benchmarks.

Qwen 3.7

Alibaba's May 2026 Qwen generation led by the closed-weight Qwen3.7-Max agent model, with preview Plus and expected mid-tier variants forming the broader family.

RAM

Random Access Memory - the fast, volatile working memory a computer uses to store data that is actively being used or processed.

Random Forest

An ensemble learning method that builds multiple decision trees on random subsets of data and features, then combines their predictions for more accurate and robust results.

RDF

Resource Description Framework , a W3C standard for representing information as subject-predicate-object triples, forming the foundation of the semantic web.

Reasoning

The ability of an AI model to break down complex problems into logical steps, draw conclusions from evidence, and arrive at answers through structured thinking rather than pattern matching alone.

Reciprocal Rank Fusion

A method for combining ranked result lists from different retrieval systems by summing reciprocal rank scores, commonly used to merge BM25 and dense retrieval results.

Recurrent Neural Network

A neural network architecture with loops that allow information to persist across time steps, designed for processing sequential data.

Recursive Language Model (RLM)

An inference approach that lets an LLM programmatically examine, decompose, and recursively call itself over snippets of extremely long input, handling contexts up to 100x beyond native window limits.

Regularization

A set of techniques that constrain model complexity during training to prevent overfitting and improve generalization to unseen data.

Reinforcement Learning

A machine learning paradigm where an agent learns to make decisions by taking actions in an environment and receiving rewards or penalties.

Relation Extraction

The NLP task of identifying and classifying semantic relationships between entities mentioned in text, a key step in knowledge graph construction.

ReLU

The Rectified Linear Unit activation function, defined as max(0, x), which has become the default non-linearity in modern deep networks due to its simple gradient and computational efficiency.

Reranking

A second-stage ranking process that reorders initially retrieved results using a more computationally expensive but accurate model, typically a cross-encoder.

Residual Connection

A shortcut that adds a layer's input directly to its output (y = F(x) + x), enabling training of very deep networks by providing a gradient highway that prevents vanishing gradients.

Retrieval Pipeline

The end-to-end sequence of steps in a RAG system: query processing, document retrieval, reranking, context construction, and LLM generation.

Retrieval-Augmented Generation

An architecture pattern that reduces LLM hallucination by retrieving relevant documents from an external knowledge base and including them as context before generating a response.

RLHF

Reinforcement Learning from Human Feedback - a training technique where human preferences are used to fine-tune AI models, aligning their outputs with what humans consider helpful, honest, and safe.

Routing Network

A small neural network within a Mixture of Experts model that decides which expert sub-networks should process each input token.

SaaS

Software as a Service - a delivery model where software is hosted in the cloud and accessed through a browser or API on a subscription basis rather than installed locally.

Sarvam AI

Companies & Platforms

An Indian AI company building sovereign, full-stack AI infrastructure for Indian languages - including open-source LLMs, speech models, translation, and agentic products across 22 Indian languages.

Self-Attention

An attention mechanism where queries, keys, and values all come from the same input sequence, allowing each token to attend to every other token in the sequence including itself.

Semantic Collapse

When an AI model loses the ability to distinguish between different or opposing concepts - mapping them to similar representations - causing repetitive outputs, failed negation handling, and retrieval systems that return irrelevant results.

Semantic Search

A search technique that finds results based on the meaning of a query rather than exact keyword matches, typically using vector embeddings and similarity metrics.

Sentiment Analysis

The automated process of identifying and extracting subjective opinion, emotion, or attitude from text.

Sigmoid

An S-shaped activation function that maps any real number to a value between 0 and 1, historically important but largely replaced by ReLU in hidden layers.

Small Language Model (SLM)

A language model with roughly 1 billion to 10 billion parameters, designed to run efficiently on edge devices and resource-constrained environments while retaining core NLP capabilities.

Softmax

A function that converts a vector of real numbers into a probability distribution, where each output is between 0 and 1 and all outputs sum to 1.

SPARQL

A query language for RDF graph databases, similar to SQL but designed for querying data represented as subject-predicate-object triples.

Sparse Model

A model where only a subset of parameters are activated for any given input, reducing compute requirements while maintaining the capacity benefits of a larger network.

Sparse Retrieval

A retrieval method using high-dimensional sparse vectors based on term frequencies (like BM25 or TF-IDF), where most vector elements are zero.

Speculative Decoding

Techniques

An inference optimization where a small, fast draft model generates candidate tokens that a larger model then verifies in parallel, speeding up generation without changing output quality.

Squash Merge

Software Engineering

A Git workflow that combines all commits from a branch into a single commit when merging it into another branch.

Stemming

A rule-based process that strips suffixes from words to reduce them to a common root form, often producing non-dictionary stems.

Stochastic Gradient Descent

An optimization algorithm that updates model parameters using the gradient computed on a small random subset (mini-batch) of the training data rather than the entire dataset.

Stop Word Removal

The practice of filtering out high-frequency function words - such as 'the', 'is', and 'in' - that carry little semantic content.

Subword Tokenization

A tokenization strategy that splits words into smaller units to balance vocabulary size with the ability to handle rare and unknown words.

Supervised Learning

A machine learning approach where the model learns from labeled training data - input-output pairs where the correct answer is provided - to predict outputs for new, unseen inputs.

SVM

Support Vector Machine - a supervised learning algorithm that finds the optimal boundary (hyperplane) separating data into classes, effective for classification and regression tasks.

Sycophancy

The tendency of AI models to agree with or flatter the user rather than provide accurate, honest, or challenging responses.

SynthID

AI Safety & Ethics

Google DeepMind's imperceptible watermarking system for AI-generated images, video, audio, and text, designed to identify SynthID-marked content without relying on removable metadata.

System Prompt

A special instruction given to an AI model before any user input that defines its behavior, personality, constraints, and role for the entire conversation.

Temperature

A parameter that controls the randomness of token sampling during LLM text generation by scaling the logits before applying softmax.

TensorFlow

An open-source machine learning framework developed by Google Brain, designed for building and deploying ML models across research and production environments, from mobile devices to distributed clusters.

Text Classification

The task of assigning a predefined category or label to a piece of text based on its content.

Text Mining

Techniques

The process of extracting meaningful patterns, trends, and structured information from large volumes of unstructured text using statistical and machine learning techniques.

Text Normalization

The process of transforming raw text into a consistent, canonical form to reduce variation before further processing.

Text-to-Cypher

The technique of using LLMs to convert natural language questions into Cypher graph queries, enabling non-technical users to query knowledge graphs.

TF-IDF

A numerical statistic combining term frequency and inverse document frequency to measure how important a word is to a document within a collection.

Token

The basic unit of text that a language model reads and produces. A token is typically a word, part of a word, or a punctuation character, depending on how the model's vocabulary was constructed.

Tokenization

The process of breaking text into smaller units called tokens, which serve as the fundamental input elements for language models.

Tool Calling

A capability that allows large language models to invoke external functions, APIs, or tools to perform actions beyond text generation.

Transfer Learning

A technique where a model trained on one task is reused as the starting point for a model on a different but related task.

Transformer

A neural network architecture based on self-attention mechanisms that processes input data in parallel, forming the basis of modern large language models.

Triple

The fundamental unit of knowledge in a graph, expressed as a (subject, predicate, object) statement such as (Alice, WORKS_AT, Acme Corp).

TRM (Tiny Recursive Model)

Samsung's 7M-parameter recursive reasoning model that outperforms LLMs 10,000x its size on abstract reasoning benchmarks like ARC-AGI.

Turing Test

A test of machine intelligence proposed by Alan Turing in 1950, in which a human evaluator tries to distinguish between a machine and a human based on natural-language conversation alone.

Underfitting

When a model is too simple to capture the underlying patterns in the data, resulting in poor performance on both training and test sets.

Universal Approximation Theorem

A theorem proving that a neural network with a single hidden layer and non-linear activation can approximate any continuous function to arbitrary precision, given enough neurons.

Unsupervised Learning

A machine learning approach where the model finds patterns, structure, and relationships in data without labeled examples or predefined correct answers.

Vanishing Gradients

A training problem where gradients become exponentially smaller as they propagate backward through many layers, effectively preventing early layers from learning.

Vector Database

A specialized database optimized for storing and querying high-dimensional vector embeddings, supporting efficient similarity search operations.

Vectorless Database

Infrastructure

A retrieval approach that bypasses vector embeddings and similarity search entirely, instead using structured indexing and LLM reasoning to find relevant information in documents.

Vectorless Retrieval

A retrieval approach for RAG systems that navigates documents using structured reasoning rather than vector similarity search, eliminating the need for embeddings and vector databases entirely.

Vibe Coding

A software development approach where a programmer describes what they want in natural language and an AI model generates the code, with the programmer guiding the process through conversation rather than writing code directly.

Weight Initialization

The strategy for setting initial values of neural network parameters before training begins, critical for ensuring stable signal and gradient propagation through deep networks.

Windsurf

An AI-powered code editor developed by Codeium that combines copilot-style suggestions with agentic capabilities for autonomous multi-file editing and task execution.

Word Embeddings

Dense vector representations of words in a continuous space where semantic similarity corresponds to geometric proximity.

xAI

Elon Musk's AI company behind the Grok chatbot and the Colossus supercomputer, merged with SpaceX in 2026.

XGBoost (Boosting)

An optimized gradient boosting library that adds regularization, parallel tree construction, and efficient handling of sparse data, making it one of the most widely used ML algorithms for tabular data.

Yi-Lightning

01.AI's speed-optimized MoE model ranking 6th on Chatbot Arena, trained for $3M and 70-80% cheaper than US frontier models.

Zero-day

A software vulnerability that is unknown to the vendor and has no available patch, giving defenders zero days of warning before it can be exploited.

Zero-Shot / Few-Shot Learning