GitHub - google/gemma.cpp: lightweight, standalone C++ inference engine for Google's Gemma models.
gemma.cpp
gemma.cpp is a lightweight, standalone C++ inference engine for the Gemma
foundation models from Google.
For additional information about Gemma, see
ai.google.dev/gemma. Model weights, including
gemma.cpp-specific artifacts, are available on
Kaggle.
Who is this project for?
Modern LLM inference engines are sophisticated systems, often with bespoke
capabilities extending beyond traditional neural network runtimes. With this
come opportunities for research and innovation through co-desi...