GitHub - KhoomeiK/LlamaGym: Fine-tune LLM agents with online reinforcement learning
LlamaGym
"Agents" originated in reinforcement learning, where they learn by interacting with an environment and receiving a reward signal. However, today's LLM-based agents do not learn online (i.e., continuously and in real time) via reinforcement.
OpenAI created Gym to standardize and simplify RL environments, but if you try dropping an LLM-based agent into a Gym environment for training, you'd find...