Lossless Compression of Vector IDs for Approximate Nearest Neighbor Search
Abstract: Approximate nearest neighbor search for vectors relies on indexes that are most often accessed from RAM. Therefore, storage is the factor limiting the size of the database that can be served from a machine. Lossy vector compression, i.e., embedding quantization, has been applied extensively to reduce the size of indexes. However, for inverted file and graph-based indexes, auxiliary data such as vector ids and links (edges) can represent most of the storage c...
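As a rough illustration of why ids can dominate storage once the vectors themselves are heavily quantized, the following sketch computes the share of each inverted-list entry taken by the id. The code sizes and the 8-byte id width are hypothetical settings chosen for illustration, not figures from the paper, and `id_storage_fraction` is a helper name introduced here.

```python
def id_storage_fraction(code_bytes: int, id_bytes: int) -> float:
    """Fraction of one stored entry (quantized code + id) occupied by the id."""
    return id_bytes / (code_bytes + id_bytes)


if __name__ == "__main__":
    # Assumed settings: 64-bit ids stored uncompressed next to quantized codes
    # of various sizes. These numbers are illustrative only.
    for code_bytes in (4, 8, 16, 32):
        frac = id_storage_fraction(code_bytes, id_bytes=8)
        print(f"{code_bytes:>2}-byte code + 8-byte id: ids take {frac:.0%} of the entry")
```

Under these assumptions, the smaller the quantized code, the larger the share of memory spent on ids, which is the regime where compressing the ids themselves becomes worthwhile.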
Read more at arxiv.org