Long-Context Retrieval Models with Monarch Mixer

Text embeddings are a critical piece of many pipelines, from search to RAG to vector databases and more. Most embedding models are BERT/Transformer-based and typically have short context lengths (e.g., 512 tokens). That's only about two pages of text, but documents can be much longer: books, legal cases, TV screenplays, and code repositories can run to tens of thousands of tokens or more. Here, we're taking a first step towards developing long-context retrieval models. We build on Monarch Mixer (M2)…
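To make the 512-token limit concrete, here is a minimal sketch (not from the post) using the Hugging Face `transformers` tokenizer; `bert-base-uncased` is just a stand-in for any standard BERT-based embedder:

```python
# Minimal sketch: a typical BERT-based embedder silently truncates input
# at its maximum context length, so most of a long document never reaches
# the model. "bert-base-uncased" is an illustrative stand-in.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

long_document = "word " * 20_000  # far longer than two pages of text

# Anything past max_length (512 here) is dropped before embedding.
encoded = tokenizer(long_document, truncation=True, max_length=512)
print(len(encoded["input_ids"]))  # -> 512
```

Everything beyond the first 512 tokens is invisible to the retrieval model, which is exactly the gap a long-context embedder aims to close.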

Read more at hazyresearch.stanford.edu
