PyGraph: Robust Compiler Support for CUDA Graphs in PyTorch
Abstract: CUDA Graphs, a recent hardware feature introduced for NVIDIA GPUs, aim to reduce CPU launch overhead by capturing and launching a series of GPU tasks (kernels) as a DAG. However, deploying CUDA Graphs faces several challenges today because the structure of a graph is static. CUDA Graphs also incur performance overhead from data copies. In fact, we show a counter-intuitive result: deploying CUDA Graphs hurts performance in many cases.
We introduce PyGraph, a novel ...
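To make the static-structure and data-copy points in the abstract concrete, here is a minimal sketch of manual CUDA Graph capture using the stock PyTorch API (`torch.cuda.CUDAGraph`, `torch.cuda.graph`). It is not PyGraph's method, which the abstract does not describe here. The function name `run_with_cuda_graph`, the tensor size, and the toy workload are assumptions chosen for illustration.

```python
def run_with_cuda_graph():
    """Capture a tiny workload once, then replay it on new data.

    Returns None when PyTorch or a CUDA device is unavailable (hypothetical
    helper for illustration only), otherwise True if the replayed result
    matches an eager computation.
    """
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None

    # The graph records fixed memory addresses, so inputs and outputs
    # must live in buffers that stay allocated across replays.
    static_input = torch.randn(1024, device="cuda")

    # Warm up on a side stream before capture, as the PyTorch docs recommend.
    side = torch.cuda.Stream()
    side.wait_stream(torch.cuda.current_stream())
    with torch.cuda.stream(side):
        for _ in range(3):
            static_output = static_input * 2 + 1
    torch.cuda.current_stream().wait_stream(side)

    # Capture: the kernels are recorded into the graph, not executed.
    graph = torch.cuda.CUDAGraph()
    with torch.cuda.graph(graph):
        static_output = static_input * 2 + 1

    # New data must be copied into the captured buffer before replay.
    # This copy is the data-copy overhead the abstract refers to.
    new_input = torch.randn(1024, device="cuda")
    static_input.copy_(new_input)

    # One CPU-side launch replays every captured kernel.
    graph.replay()
    torch.cuda.synchronize()

    return bool(torch.allclose(static_output, new_input * 2 + 1))


if __name__ == "__main__":
    print(run_with_cuda_graph())
```

The sketch shows both costs at once: the captured graph can only read from and write to the buffers it saw during capture, so each replay on fresh data pays for an extra copy into `static_input`.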