News Score: Score the News, Sort the News, Rewrite the Headlines

DeepSeek-R1 and exploring DeepSeek-R1-Distill-Llama-8B

20th January 2025

DeepSeek are the Chinese AI lab who dropped the best currently available open weights LLM on Christmas Day, DeepSeek v3. That model was trained in part using their unreleased R1 “reasoning” model. Today they’ve released R1 itself, along with a whole family of new models derived from that base. There’s a lot in the new release. DeepSeek-R1-Zero appears to be the base model. It’s over 650GB in size and, like most of their other releases, is under a clean MIT licens...

Read more at simonwillison.net
