Foundation Model
Fundamentals

A large AI model trained on broad data at scale that can be adapted to a wide range of downstream tasks, serving as the base layer for many AI applications.
A foundation model is a large-scale AI model trained on vast, diverse datasets that can be adapted to perform a wide variety of tasks without being retrained from scratch. The term was coined by Stanford's Institute for Human-Centered AI in 2021 to describe models like GPT, Claude, Gemini, and Llama that serve as a common base (or "foundation") upon which many different applications are built.
Foundation models are distinguished by their generality. A single model can write code, answer questions, translate languages, analyze images, and reason through complex problems. This generality comes from training on internet-scale datasets that cover an enormous range of human knowledge and language patterns. The model learns broad capabilities during pretraining, which can then be refined for specific use cases through fine-tuning, prompt engineering, or retrieval-augmented generation.
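The adaptation methods above can be sketched conceptually. The snippet below uses a hypothetical `FoundationModel` stub (not a real API) to show how prompt engineering and retrieval-augmented generation both steer a single base model toward different tasks without retraining it:

```python
# Illustrative sketch only: FoundationModel is a hypothetical stand-in
# for a large pretrained model behind an API, not a real library class.

class FoundationModel:
    """Stub for a pretrained base model that completes text prompts."""

    def generate(self, prompt: str) -> str:
        # A real model would return learned text; this stub just echoes
        # the prompt so the adaptation pattern is visible.
        return f"[model response to: {prompt!r}]"


def translate(model: FoundationModel, text: str, target: str) -> str:
    # Prompt engineering: the same base model, steered by instructions
    # in the prompt rather than by changing its weights.
    return model.generate(f"Translate to {target}: {text}")


def answer_with_context(model: FoundationModel, question: str,
                        docs: list[str]) -> str:
    # Retrieval-augmented generation: retrieved documents are prepended
    # to the prompt so the model can ground its answer in them.
    context = "\n".join(docs)
    return model.generate(f"Context:\n{context}\n\nQuestion: {question}")


model = FoundationModel()
print(translate(model, "hello", "French"))
print(answer_with_context(
    model,
    "What is a foundation model?",
    ["A foundation model is a large pretrained base model."],
))
```

The point of the sketch is that neither function touches the model's parameters: one base capability serves many downstream tasks, with only the prompt construction differing per application. Fine-tuning, by contrast, would update the weights themselves.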
The foundation model paradigm has reshaped the structure of the AI industry. Instead of building task-specific models from scratch, most AI applications now build on top of existing foundation models, either through APIs or by fine-tuning open-weight versions. This creates a layered ecosystem in which a small number of foundation model providers (Anthropic, OpenAI, Google, Meta) supply the base capability, and a much larger number of companies build specialized applications on top. The rapid expansion of foundation model capabilities, particularly through features like tool calling and MCP connectors, has also raised questions about which application-layer products remain defensible when the foundation layer keeps absorbing new functionality.
Last updated: February 26, 2026