Approximate Nearest Neighbor

Information Retrieval

An algorithm that finds points approximately closest to a query in high-dimensional space, trading small accuracy loss for dramatically faster search over large datasets.

Like finding the closest coffee shop by checking your neighborhood first instead of measuring the distance to every shop in the city.

Approximate Nearest Neighbor (ANN) search is a family of algorithms that find vectors approximately closest to a query vector in high-dimensional space, accepting small accuracy losses in exchange for dramatically faster search times. While exact nearest neighbor search requires comparing the query against every vector in the database (O(n) time), ANN algorithms achieve sub-linear search time, often O(log n).

The most popular ANN algorithms in vector databases include HNSW (Hierarchical Navigable Small World), which builds a multi-layer graph of vectors for efficient navigation; IVF (Inverted File Index), which partitions vectors into clusters and only searches nearby clusters; and product quantization, which compresses vectors to reduce memory usage and speed up distance computation.

ANN search is essential for production RAG systems because exact search becomes prohibitively slow at scale. A brute-force search over 10 million 1536-dimensional vectors takes seconds, while HNSW can find approximate nearest neighbors in milliseconds. The accuracy loss is typically 95-99% recall (meaning 95-99% of the true nearest neighbors are found), which is an acceptable tradeoff for most retrieval applications.

Related Terms

Vector Database FAISS HNSW Embedding

Last updated: February 22, 2026