"AI Hardware Startup Groq Revolutionizes Industry with High-Speed, Cost-Effective Inference API; Utilizes US-manufactured Chips for Diversified Supply Chain"

Groq Inference Tokenomics: Speed, But At What Cost?

Groq, an AI hardware startup, has been making the rounds recently because of its impressive demos showcasing the leading open-source model, Mistral Mixtral 8x7b, on its inference API. The company is achieving up to 4x the throughput of other inference services while charging less than 1/3 of what Mistral itself charges.

Groq has a genuine performance advantage for an individual sequence. This could enable techniques such as chain of thought to be far more usable in the real worl...