laion/rl_mixed-struct-step37_terminus-structured
The laion/rl_mixed-struct-step37_terminus-structured model is an 8-billion-parameter, RL-trained language model based on Qwen3, with a 32K context window, developed by laion. It is optimized for structured tool calls (bash, view, edit, create, and search), which makes it a fit for software engineering tasks. It achieves 42% pass@3 on the SWEBench-100 benchmark, which measures automated code repair and development.
Overview
laion/rl_mixed-struct-step37_terminus-structured is an 8-billion-parameter language model based on the Qwen3 architecture, developed by laion. It was fine-tuned with Reinforcement Learning (RL) for 37 steps using a terminus-structured agent, which is designed for structured tool interactions. The model is tuned for tasks that require external tools: bash, view, edit, create, and search.
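To make "structured tool interactions" concrete, the sketch below validates a tool call before it is executed. Only the five tool names come from this card. The JSON layout, the argument names, and the `parse_tool_call` helper are assumptions made for illustration and may not match the format the model actually emits.

```python
import json

# Tool names are taken from the model card; the argument fields are hypothetical.
ALLOWED_TOOLS = {
    "bash": {"command"},
    "view": {"path"},
    "edit": {"path", "old_text", "new_text"},
    "create": {"path", "content"},
    "search": {"query"},
}


def parse_tool_call(raw: str) -> dict:
    """Parse a JSON tool call and check it against the assumed schema."""
    call = json.loads(raw)
    if not isinstance(call, dict):
        raise ValueError("tool call must be a JSON object")
    tool = call.get("tool")
    if tool not in ALLOWED_TOOLS:
        raise ValueError(f"unknown tool: {tool!r}")
    args = call.get("args", {})
    if not isinstance(args, dict):
        raise ValueError("args must be a JSON object")
    # Every required argument for the chosen tool must be present.
    missing = ALLOWED_TOOLS[tool] - set(args)
    if missing:
        raise ValueError(f"{tool} is missing arguments: {sorted(missing)}")
    return {"tool": tool, "args": args}


call = parse_tool_call('{"tool": "bash", "args": {"command": "ls -la"}}')
print(call["tool"], call["args"]["command"])
```

Rejecting unknown tools and incomplete arguments up front keeps malformed model output from reaching the execution environment.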
Key Capabilities & Performance
- Structured Tool Calls: Optimized for interacting with environments via structured commands, making it suitable for automated development workflows.
- Software Engineering Tasks: Achieves 42% pass@3 on the SWEBench-100 benchmark for code-related challenges.
- Context Window: Features a substantial 32K context window (24K input + 8K output), allowing it to handle complex and lengthy problem descriptions or codebases.
- RL Training: Leverages the rloo-n training method with the BenSkyRL + Harbor framework, building upon the laion/r2egym-nl2bash-stack-bugsseq-fixthink-again base model.
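Two of the figures above are simple arithmetic and can be sketched in code: the 24K + 8K split of the context window, and a pass@k score. This is a minimal sketch under stated assumptions: "K" is read as 1024 tokens, and pass@k is computed as the fraction of tasks solved by at least one of the first k attempts. The card does not say how its pass@3 figure was computed, and the helper names are hypothetical.

```python
from typing import List

# Limits from the model card, reading "K" as 1024 tokens (an assumption).
MAX_INPUT_TOKENS = 24 * 1024
MAX_OUTPUT_TOKENS = 8 * 1024
CONTEXT_WINDOW = MAX_INPUT_TOKENS + MAX_OUTPUT_TOKENS  # 32 * 1024


def fits_budget(prompt_tokens: int, max_new_tokens: int) -> bool:
    """Check a request against the assumed input and output limits."""
    return (0 <= prompt_tokens <= MAX_INPUT_TOKENS
            and 0 < max_new_tokens <= MAX_OUTPUT_TOKENS)


def pass_at_k(results: List[List[bool]], k: int) -> float:
    """Fraction of tasks where at least one of the first k attempts passed."""
    if not results or k < 1:
        raise ValueError("need at least one task and k >= 1")
    solved = sum(1 for attempts in results if any(attempts[:k]))
    return solved / len(results)


# Illustrative data only: 100 tasks, 3 attempts each, 42 solved at least once.
outcomes = [[False, True, False]] * 42 + [[False, False, False]] * 58
print(fits_budget(20_000, 4_096), pass_at_k(outcomes, k=3))
```

The outcome list is synthetic and exists only to show how a 42% score arises from 100 tasks; it is not the model's actual evaluation data.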
Use Cases
This model is particularly well-suited for:
- Automated software development and bug fixing.
- Code generation and modification requiring tool interaction.
- Tasks involving command-line execution and file system manipulation.
- Research into RL-driven agents for complex, structured environments.
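For the command-line use case, an agent loop needs some way to execute the commands the model proposes and return the result. The sketch below is one hypothetical way to do that with Python's standard `subprocess` module. The `run_bash` helper and its return format are assumptions, not part of the model or its training framework, and it assumes a `bash` binary is available.

```python
import subprocess


def run_bash(command: str, timeout_s: float = 30.0) -> dict:
    """Run a command with bash and capture its exit code and output."""
    try:
        proc = subprocess.run(
            ["bash", "-c", command],
            capture_output=True,  # collect stdout and stderr
            text=True,            # decode output as str
            timeout=timeout_s,    # stop commands that hang
        )
    except subprocess.TimeoutExpired:
        return {"exit_code": None, "stdout": "", "stderr": "timed out"}
    return {
        "exit_code": proc.returncode,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
    }


result = run_bash("echo hello")
print(result["exit_code"], result["stdout"].strip())
```

This helper provides no sandboxing. Model-generated commands should only be run inside an isolated environment such as a disposable container.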