Meta Open-Sources MEGALODON LLM for Efficient Long Sequence Modeling
Researchers from Meta, the University of Southern California, Carnegie Mellon University, and the University of California San Diego recently open-sourced MEGALODON, a large language model (LLM) with an unlimited context length. MEGALODON has linear computational complexity and outperforms a similarly sized Llama 2 model on a range of benchmarks.
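To see why linear complexity matters for long sequences, consider a back-of-the-envelope comparison: full self-attention scores every query-key pair, so its cost grows quadratically with sequence length, while restricting attention to fixed-size chunks makes the cost grow linearly. The sketch below is a hypothetical illustration of that scaling argument, not MEGALODON's actual implementation; the function names and chunk size are assumptions for the example.

```python
def full_attention_pairs(n: int) -> int:
    """Query-key score computations for full self-attention: n^2."""
    return n * n

def chunked_attention_pairs(n: int, chunk: int) -> int:
    """Score computations when attention is limited to fixed-size chunks.

    (n // chunk) chunks, each costing chunk^2, gives n * chunk total --
    linear in the sequence length n for a fixed chunk size.
    """
    assert n % chunk == 0, "assume sequence length divides evenly into chunks"
    return (n // chunk) * chunk * chunk

if __name__ == "__main__":
    # Doubling n quadruples the full-attention cost but only
    # doubles the chunked cost.
    for n in (1024, 2048, 4096):
        print(n, full_attention_pairs(n), chunked_attention_pairs(n, 512))
```

With a fixed chunk size of 512, going from 1,024 to 4,096 tokens multiplies the full-attention cost by 16 but the chunked cost by only 4, which is the practical payoff of a linear-complexity design at long context lengths.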
MEGALODON is designed to address several shortcomings of the Transformer neural architecture underlying most LLMs. Instead of the standard multi-head attention,...
Read more at infoq.com