GitHub - deepseek-ai/DualPipe
DualPipe
DualPipe is an innovative bidirectional pipeline parallism algorithm introduced in the DeepSeek-V3 Technical Report. It achieves full overlap of forward and backward computation-communication phases, also reducing pipeline bubbles. For detailed information on computation-communication overlap, please refer to the profile data.
Schedules
Example DualPipe scheduling for 8 PP ranks and 20 micro-batches in two directions.
The micro-batches in the reverse direction are symmetric to those in ...
Read more at github.com