News Score: Score the News, Sort the News, Rewrite the Headlines

EleutherAI releases massive AI training dataset of licensed and open domain text | TechCrunch

EleutherAI, an AI research organization, has released what it claims is one of the largest collections of licensed and open-domain text for training AI models. The dataset, called the Common Pile v0.1, took around two years to complete in collaboration with AI startups Poolside, Hugging Face, and others, along with several academic institutions. Weighing in at 8 terabytes, the Common Pile v0.1 was used to train two new AI models from EleutherAI, Comma v0.1-1T and Comma v0.1-2T, that Eleu...

Read more at techcrunch.com
