News Score: Score the News, Sort the News, Rewrite the Headlines

Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion

* Work done while being a visiting student at MIT.
1 MIT

TL;DR: Diffusion Forcing combines the strengths of full-sequence diffusion models and next-token models, acting as either one, or a mix of both, at sampling time for different applications without retraining.

Abstract

This paper presents Diffusion Forcing, a new training paradigm where a diffusion model is trained to denoise a set of tokens with independent per-token noise levels. We apply Diffusion Forcing to sequence generative modeling by training a ca...
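The per-token noising described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the linear beta schedule, the DDPM-style forward process, and the names `make_alpha_bar` and `noise_tokens_independently` are all assumptions made for illustration. The only idea taken from the text is that every token in a sequence receives its own independently sampled noise level, rather than one level shared by the whole sequence.

```python
import math
import random


def make_alpha_bar(num_levels=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal-retention factors for an ASSUMED linear beta schedule."""
    alpha_bar, prod = [], 1.0
    for k in range(num_levels):
        beta = beta_start + (beta_end - beta_start) * k / (num_levels - 1)
        prod *= 1.0 - beta
        alpha_bar.append(prod)
    return alpha_bar


def noise_tokens_independently(tokens, alpha_bar, rng):
    """Noise each token (a list of floats) at its own independently drawn level."""
    noisy, levels, noises = [], [], []
    for token in tokens:
        k = rng.randrange(len(alpha_bar))  # independent noise level for this token
        a = alpha_bar[k]
        eps = [rng.gauss(0.0, 1.0) for _ in token]
        # Assumed DDPM-style forward process: sqrt(a) * x + sqrt(1 - a) * eps
        noisy.append([math.sqrt(a) * x + math.sqrt(1.0 - a) * e
                      for x, e in zip(token, eps)])
        levels.append(k)
        noises.append(eps)
    return noisy, levels, noises


# Hypothetical usage: a sequence of six 2-dimensional tokens.
rng = random.Random(0)
alpha_bar = make_alpha_bar()
tokens = [[1.0, -1.0] for _ in range(6)]
noisy, levels, noises = noise_tokens_independently(tokens, alpha_bar, rng)
```

Under these assumptions, a denoiser would be trained to recover `noises` (or the clean `tokens`) from `noisy` and `levels`. Because the levels differ from token to token, a single trained model sees both nearly clean and heavily noised tokens within the same sequence.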

Read more at boyuan.space
