Training great LLMs entirely from ground zero in the wilderness as a startup — Yi Tay
Given that we’ve successfully trained pretty strong multimodal language models at Reka, many people have been particularly curious about the experience of building infrastructure and training large language & multimodal models completely from scratch. I complain a lot about external (outside Google) infrastructure and code on my social media, which has made people curious about what I miss and what I hate/love in the wilderness. So here’s a post (finally)...
Read more at yitay.net