Harnessing 3200 Gbps Network: A Journey with RDMA, EFA, and libfabric
Earlier this year, I had the good fortune of joining Perplexity AI, where I finally got to use servers with the most powerful configuration: AWS p5 instances equipped with 8 NVIDIA H100 GPUs interconnected via NVSwitch. What excited me even more was the ultra-high-speed 3200 Gbps network between servers. I thought it would be incredibly cool to write a program that could use the full 3200 Gbps of bandwidth!
Recently, I spent a week exploring this, developed a small proof-of-concept program, ...
Read more at le.qun.ch