>_TheQuery
← Glossary

Approximate Nearest Neighbor

Information Retrieval

An algorithm that finds points approximately closest to a query in high-dimensional space, trading small accuracy loss for dramatically faster search over large datasets.

Approximate Nearest Neighbor (ANN) search is a family of algorithms that find vectors approximately closest to a query vector in high-dimensional space, accepting small accuracy losses in exchange for dramatically faster search times. While exact nearest neighbor search requires comparing the query against every vector in the database (O(n) time), ANN algorithms achieve sub-linear search time, often O(log n).

The most popular ANN algorithms in vector databases include HNSW (Hierarchical Navigable Small World), which builds a multi-layer graph of vectors for efficient navigation; IVF (Inverted File Index), which partitions vectors into clusters and only searches nearby clusters; and product quantization, which compresses vectors to reduce memory usage and speed up distance computation.

ANN search is essential for production RAG systems because exact search becomes prohibitively slow at scale. A brute-force search over 10 million 1536-dimensional vectors takes seconds, while HNSW can find approximate nearest neighbors in milliseconds. The accuracy loss is typically 95-99% recall (meaning 95-99% of the true nearest neighbors are found), which is an acceptable tradeoff for most retrieval applications.

Last updated: February 22, 2026