State-of-the-Art Exact Binary Vector Search for RAG in 100 lines of Julia
May 16, 2024
I wanted to experiment with how quickly exact RAG lookups could be performed in a binary vector space.
Why binary?
It turns out the accuracy is very similar to that of a full 32-bit floating-point vector, but we save a lot on server costs, and in-memory retrieval becomes more feasible.
A database that was once 1TB is now ~32GB, which can easily fit in RAM on a much cheaper setup.
Assuming each row is a 1024-dim float32 vector, that's 4096 bytes per row, so 1TB holds ~244 million rows.
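To make the numbers above concrete, here is a minimal sketch of the storage arithmetic together with one common way to binarize a vector and compare two binary vectors. It assumes sign-based quantization (bit set where the component is positive) and Hamming distance as the metric; the helper names `binarize` and `hamming` are hypothetical and chosen for illustration, not taken from the original post.

```julia
# Hypothetical helpers for illustration only.

# Pack the sign bits of a vector into UInt64 words (bit = 1 where x > 0).
function binarize(v::AbstractVector{<:Real})
    out = zeros(UInt64, cld(length(v), 64))
    for (i, x) in enumerate(v)
        if x > 0
            word = div(i - 1, 64) + 1   # which UInt64 holds this bit
            bit = (i - 1) % 64          # position inside that word
            out[word] |= UInt64(1) << bit
        end
    end
    return out
end

# Hamming distance: number of differing bits between two packed vectors.
hamming(a, b) = sum(count_ones(x ⊻ y) for (x, y) in zip(a, b))

# Storage arithmetic for a 1 TB (10^12 bytes) float32 database.
dim = 1024
bytes_f32 = dim * sizeof(Float32)        # 4096 bytes per row
bytes_bin = dim ÷ 8                      # 128 bytes per row at 1 bit per dimension
rows = 10^12 ÷ bytes_f32                 # 244140625 rows, i.e. ~244 million
total_bin_gb = rows * bytes_bin / 10^9   # 31.25 GB, a 32x reduction

println((bytes_f32, bytes_bin, rows, total_bin_gb))
```

Going from 32 bits to 1 bit per dimension is where the 32x saving comes from, and XOR plus `count_ones` keeps each comparison down to a handful of integer operations per 64 dimensions.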
For some ad...
Read more at domluna.com