Building Meta’s GenAI Infrastructure
Marking a major investment in Meta’s AI future, we are announcing two 24k GPU clusters. We are sharing details on the hardware, network, storage, design, performance, and software that help us extract high throughput and reliability for various AI workloads. We use this cluster design for Llama 3 training.
We are strongly committed to open compute and open source. We built these clusters on top of Grand Teton, OpenRack, and PyTorch and continue to push open innovation across the industry.
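As a rough illustration of the kind of workload these clusters are built to serve, below is a minimal PyTorch distributed data-parallel training sketch. The toy model, the NCCL backend choice, and the torchrun-style environment variables are assumptions for illustration only, not details taken from this announcement.

# Minimal sketch of multi-GPU data-parallel training in PyTorch.
# Assumes launch via `torchrun`, which sets RANK, LOCAL_RANK, and WORLD_SIZE;
# the toy model and optimizer are placeholders, not Meta's training stack.
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # NCCL is the usual backend for GPU collectives on clusters like these.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Placeholder model; DDP all-reduces gradients across participating GPUs.
    model = torch.nn.Linear(4096, 4096).cuda(local_rank)
    model = DDP(model, device_ids=[local_rank])
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

    for step in range(10):
        inputs = torch.randn(32, 4096, device=local_rank)
        loss = model(inputs).pow(2).mean()
        optimizer.zero_grad()
        loss.backward()   # gradient all-reduce overlaps with the backward pass
        optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()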
Read more at engineering.fb.com