Unsupervised Learning

Fundamentals

A machine learning approach where the model finds patterns, structure, and relationships in data without labeled examples or predefined correct answers.

Imagine sorting a pile of unlabeled photos into albums: nobody tells you the categories; you figure out the groupings yourself based on what you see.

Unsupervised learning works with unlabeled data. The model receives inputs with no corresponding correct outputs and must discover meaningful structure on its own. Instead of learning to map inputs to known answers, it learns the underlying patterns, groupings, and distributions in the data.

The main unsupervised learning tasks are:

Clustering groups similar data points together without being told what the groups should be. K-means clustering, DBSCAN, and hierarchical clustering are common algorithms. Applications include customer segmentation, anomaly detection, and grouping similar documents.
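
To make the clustering idea concrete, here is a minimal k-means sketch in plain numpy (the `kmeans` helper and the synthetic two-blob data are illustrative, not from any particular library): alternate between assigning points to their nearest centroid and moving each centroid to the mean of its assigned points.

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Minimal k-means: alternate between assigning points to the
    nearest centroid and moving each centroid to its cluster mean."""
    rng = np.random.default_rng(seed)
    # Initialize centroids as k distinct data points.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Assign each point to its nearest centroid (Euclidean distance).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its assigned points.
        new_centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Two well-separated blobs; note the algorithm never sees any labels.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(5, 0.5, (50, 2))])
labels, centroids = kmeans(X, k=2)
```

On data this well separated, the recovered clusters match the two blobs; real algorithms add refinements such as smarter initialization (k-means++) and handling of empty clusters.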

Dimensionality reduction compresses high-dimensional data into fewer dimensions while preserving meaningful structure. PCA (Principal Component Analysis) and t-SNE are widely used for visualization and feature compression. Autoencoders learn compressed representations using neural networks.
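
As a sketch of the dimensionality-reduction idea, PCA can be implemented in a few lines via the SVD of the mean-centered data matrix (the `pca` helper and the synthetic near-planar data below are illustrative assumptions):

```python
import numpy as np

def pca(X, n_components):
    """Project X onto its top principal components via SVD of the
    mean-centered data matrix."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by explained variance.
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

# 3-D points that lie near a 2-D plane: two components capture
# almost all of the variance.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
X = latent @ rng.normal(size=(2, 3)) + rng.normal(scale=0.01, size=(200, 3))
Z = pca(X, n_components=2)  # shape (200, 2): the compressed representation
```

Because the data here is almost exactly planar, the two-component projection preserves nearly all of the variance while dropping a third of the storage.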

Density estimation models the probability distribution of the data itself. This enables anomaly detection (points in low-density regions are unusual) and generative modeling (sampling new data points from the learned distribution).
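
The anomaly-detection use of density estimation can be sketched with a one-dimensional Gaussian kernel density estimate (the `kde_score` helper below is a hypothetical minimal implementation, not SciPy's `gaussian_kde`): average a Gaussian bump centered on each data point, and flag query points that land in low-density regions.

```python
import numpy as np

def kde_score(x_query, data, bandwidth=0.5):
    """Kernel density estimate at x_query: the average of a Gaussian
    kernel centered on each data point."""
    diffs = (x_query - data) / bandwidth
    kernels = np.exp(-0.5 * diffs**2) / np.sqrt(2 * np.pi)
    return kernels.mean() / bandwidth

# Data clustered around 0; score a typical point against an outlier.
rng = np.random.default_rng(0)
data = rng.normal(0, 1, 500)
p_typical = kde_score(0.0, data)
p_outlier = kde_score(8.0, data)
# The outlier sits in a low-density region, so p_outlier << p_typical.
```

Sampling new points from the estimated density (pick a data point at random, add Gaussian noise of the bandwidth's scale) is the simplest form of the generative use mentioned above.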

Unsupervised learning is valuable when labeled data is scarce or expensive to obtain, which describes most real-world scenarios. It also serves as a preprocessing step for supervised learning: clustering can reveal natural groupings that inform labeling, and dimensionality reduction can make downstream models faster and more effective.

Modern self-supervised learning, which powers the pre-training of large language models and vision models, is closely related to unsupervised learning. The model generates its own supervision signal from the structure of the data (predicting masked words, predicting the next token) rather than relying on human-provided labels.
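
A toy illustration of how a supervision signal comes from the data itself (the corpus and the count-based predictor are invented for illustration): next-token prediction turns raw text into (input, target) pairs with no human annotation.

```python
from collections import Counter, defaultdict

# Raw, unlabeled text: the "labels" are simply the next word.
text = "the cat sat on the mat the cat ran".split()
pairs = list(zip(text[:-1], text[1:]))  # (input, target) pairs from the data itself

# A trivial count-based next-word predictor trained on these pairs.
counts = defaultdict(Counter)
for prev, nxt in pairs:
    counts[prev][nxt] += 1

def predict(word):
    """Return the most frequent word observed after `word`."""
    return counts[word].most_common(1)[0][0]

print(predict("the"))  # → "cat" ("the" is followed by "cat" twice, "mat" once)
```

Large language models replace the count table with a neural network and a vastly larger corpus, but the supervision signal is constructed the same way: from the structure of the data, not from human-provided labels.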

Last updated: March 10, 2026