News Score: Score the News, Sort the News, Rewrite the Headlines

Video models are zero-shot learners and reasoners

Fascinating new paper from Google DeepMind, which makes a very convincing case that their Veo 3 model - and generative video models in general - play a role in the machine learning visual ecosystem similar to the one LLMs play for text. LLMs took the ability to predict the next token and turned it into general-purpose foundation models for all manner of tasks that used to be handled by dedicated models - summarization, translation, part-of-speech tagging...

Read more at simonwillison.net
