How Meta trains large language models at scale
As we continue to focus our AI research and development on solving increasingly complex problems, one of the most significant and challenging shifts we’ve experienced is the sheer scale of computation required to train large language models (LLMs).
Traditionally, our AI model training has involved training a massive number of models, each requiring a comparatively small number of GPUs. This was the case for our recommendation models (e.g., our feed and ranking models) that would ingest vast amou...
Read more at engineering.fb.com