laion/rl_r2egym-nl2bash-swesmith-pymethods2test_terminus-structured
The laion/rl_r2egym-nl2bash-swesmith-pymethods2test_terminus-structured model is an 8 billion parameter Qwen3-based language model, developed by laion and trained with Reinforcement Learning (RL) to issue structured tool calls. It targets code generation and problem-solving tasks, reaching 42% pass@3 on SWEBench-100 and 94-100% pass@8 on Pymethods2test. The model is optimized for agentic workflows that use bash, view, edit, create, and search tools within a 32k token context window.
Overview
This model, developed by laion, is an 8 billion parameter Qwen3-based language model trained with multi-stage Reinforcement Learning (RL). It is designed for agentic problem-solving, using structured tool calls for tasks that require interacting with an environment. The training pipeline began with supervised fine-tuning (SFT) on datasets such as r2egym, nl2bash, and swesmith, followed by progressive RL stages on mixed datasets, the full r2egym set, and pymethods2test.
Key Capabilities
- Structured Tool Use: Trained with a terminus-structured agent, enabling the use of bash, view, edit, create, and search tools for complex interactions.
- Code Generation & Problem Solving: Achieves a 42% pass@3 on SWEBench-100, an improvement over its base model's 37%, and 94-100% pass@8 on Pymethods2test.
- Enhanced SWEBench Performance: Solved 14 SWEBench tasks that the base model could not.
- Extended Context Window: Operates with a 32,768-token context length (24k input + 8k output).
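The pass@k figures above are usually read as the share of tasks solved by at least one of k sampled attempts. The sketch below shows that empirical form under stated assumptions: the function name `pass_at_k` and the toy outcomes are hypothetical, and the card does not say whether its reported numbers use this direct count or an unbiased estimator over more samples.

```python
from typing import Sequence


def pass_at_k(results: Sequence[Sequence[bool]], k: int) -> float:
    """Fraction of tasks where at least one of the first k attempts passed."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if not results:
        raise ValueError("results must contain at least one task")
    solved = sum(1 for attempts in results if any(attempts[:k]))
    return solved / len(results)


# Made-up outcomes for 4 tasks with 3 attempts each (not real benchmark data).
toy_results = [
    [False, True, False],   # solved on the second attempt
    [False, False, False],  # never solved
    [True, True, True],     # solved on every attempt
    [False, False, True],   # solved on the third attempt
]
toy_score = pass_at_k(toy_results, k=3)
```

Under this reading, 42% pass@3 on a 100-task set corresponds to 42 tasks with at least one passing attempt out of three.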
Good For
- Automated Software Engineering: Ideal for tasks requiring automated code fixes, refactoring, and testing.
- Agentic Workflows: Suitable for applications where an LLM needs to interact with a system using defined tools.
- Complex Code Generation: Excels in scenarios demanding high accuracy in generating and verifying code solutions.
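For the agentic use case, a host application has to route each structured tool call from the model to a handler. The following is a minimal sketch of such a dispatcher. Only the five tool names come from this card; the JSON shape (`tool`, `arguments`), the argument names, and the stub handlers are assumptions for illustration and may not match the actual terminus-structured format.

```python
import json
from typing import Callable, Dict


def _stub(name: str) -> Callable[..., str]:
    """Build a placeholder handler; a real one would touch the environment."""
    def run(**kwargs) -> str:
        return f"{name} called with {sorted(kwargs)}"
    return run


# Tool names are from the model card; the handlers are hypothetical stubs.
TOOLS: Dict[str, Callable[..., str]] = {
    name: _stub(name) for name in ("bash", "view", "edit", "create", "search")
}


def dispatch(raw_call: str) -> str:
    """Parse one JSON tool call (assumed format) and route it to a handler."""
    try:
        call = json.loads(raw_call)
    except json.JSONDecodeError as exc:
        return f"error: invalid JSON ({exc.msg})"
    if not isinstance(call, dict):
        return "error: tool call must be a JSON object"
    name = call.get("tool")
    if not isinstance(name, str) or name not in TOOLS:
        return f"error: unknown tool {name!r}"
    args = call.get("arguments", {})
    if not isinstance(args, dict):
        return "error: arguments must be a JSON object"
    return TOOLS[name](**args)


example = dispatch('{"tool": "view", "arguments": {"path": "src/app.py"}}')
```

Returning error strings instead of raising lets the host feed a malformed call back to the model as an observation, which is one common design choice for agent loops.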