laion/rl_r2egym-nl2bash-swesmith-pymethods2test_terminus-structured

Text Generation | Concurrency Cost: 1 | Model Size: 8B | Quant: FP8 | Ctx Length: 32k | Published: Mar 23, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights | Warm

The laion/rl_r2egym-nl2bash-swesmith-pymethods2test_terminus-structured model is an 8 billion parameter Qwen3-based language model, developed by laion, specifically trained with Reinforcement Learning (RL) for structured tool calls. It excels at code generation and problem-solving tasks, demonstrating a 42% pass@3 on SWEBench-100 and 94-100% pass@8 on Pymethods2test. This model is optimized for agentic workflows, utilizing bash, view, edit, create, and search tools within a 32k token context window.


Overview

This model, developed by laion, is an 8 billion parameter Qwen3-based language model trained with multi-stage Reinforcement Learning (RL). It is designed for agentic problem-solving and issues structured tool calls for tasks that require interaction with an environment. The training pipeline began with supervised fine-tuning (SFT) on datasets such as r2egym, nl2bash, and swesmith, followed by progressive RL stages on mixed datasets, the full r2egym dataset, and pymethods2test.

Key Capabilities

  • Structured Tool Use: Trained with a terminus-structured agent, enabling the use of bash, view, edit, create, and search tools for complex interactions.
  • Code Generation & Problem Solving: Achieves a 42% pass@3 on SWEBench-100, an improvement over its base model's 37%, and 94-100% pass@8 on Pymethods2test.
  • Enhanced SWEBench Performance: Solved 14 SWEBench tasks that the base model could not.
  • Extended Context Window: Operates with a 32,768-token context length, split into 24k input tokens and 8k output tokens.
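
To make the pass@k figures and the context split above concrete, here is a minimal Python sketch. It assumes the simple empirical reading of pass@k (a task counts as solved if any of its first k attempts passes); the card does not say how its numbers were computed, so this may differ from the evaluation actually used. It also assumes "24k" and "8k" mean multiples of 1,024, which is consistent with the stated 32,768 total. The function and constant names are illustrative only.

```python
from typing import Sequence

# Assumption: "24k input + 8k output" read as multiples of 1024 tokens.
MAX_INPUT_TOKENS = 24 * 1024   # 24,576
MAX_OUTPUT_TOKENS = 8 * 1024   # 8,192
CONTEXT_LENGTH = MAX_INPUT_TOKENS + MAX_OUTPUT_TOKENS  # 32,768


def empirical_pass_at_k(results: Sequence[Sequence[bool]], k: int) -> float:
    """Fraction of tasks with at least one passing attempt among the first k.

    `results[i][j]` is True if attempt j on task i passed.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if not results:
        raise ValueError("results must contain at least one task")
    solved = sum(1 for attempts in results if any(attempts[:k]))
    return solved / len(results)


def fits_input_budget(prompt_tokens: int) -> bool:
    """Check a prompt's token count against the assumed input budget."""
    return 0 <= prompt_tokens <= MAX_INPUT_TOKENS


if __name__ == "__main__":
    # Toy data: 4 tasks, 3 attempts each (not real benchmark results).
    toy = [
        [False, True, False],
        [False, False, False],
        [True, True, True],
        [False, False, True],
    ]
    print(empirical_pass_at_k(toy, k=3))  # 3 of 4 tasks solved -> 0.75
    print(empirical_pass_at_k(toy, k=1))  # 1 of 4 tasks solved -> 0.25
    print(fits_input_budget(20_000))      # True
```

Under this reading, 42% pass@3 on a 100-task benchmark would correspond to 42 tasks with at least one successful attempt out of three.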

Good For

  • Automated Software Engineering: Ideal for tasks requiring automated code fixes, refactoring, and testing.
  • Agentic Workflows: Suitable for applications where an LLM needs to interact with a system using defined tools.
  • Complex Code Generation: Excels in scenarios demanding high accuracy in generating and verifying code solutions.
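
For agentic use, the host application has to validate each tool call the model emits and route it to a handler. The sketch below shows one way to do that in Python. The five tool names come from this card; everything else is a hypothetical stand-in: the JSON envelope (`{"tool": ..., "arguments": {...}}`), the argument names, and the in-memory workspace are assumptions for illustration, not the terminus-structured agent's actual format.

```python
import json
from typing import Callable, Dict

# Tool names are from the model card; argument names and the JSON
# envelope are hypothetical, chosen only for this illustration.
Workspace = Dict[str, str]  # path -> file contents (in-memory stand-in)


def tool_view(ws: Workspace, path: str) -> str:
    return ws[path]


def tool_create(ws: Workspace, path: str, content: str) -> str:
    ws[path] = content
    return f"created {path}"


def tool_edit(ws: Workspace, path: str, old: str, new: str) -> str:
    if old not in ws[path]:
        raise ValueError(f"text to replace not found in {path}")
    ws[path] = ws[path].replace(old, new, 1)  # replace first match only
    return f"edited {path}"


def tool_search(ws: Workspace, query: str) -> str:
    hits = sorted(p for p, text in ws.items() if query in text)
    return "\n".join(hits)


def tool_bash(ws: Workspace, command: str) -> str:
    # Deliberately not executed; a real harness would run this in a sandbox.
    return f"(not executed) {command}"


TOOLS: Dict[str, Callable[..., str]] = {
    "bash": tool_bash,
    "view": tool_view,
    "edit": tool_edit,
    "create": tool_create,
    "search": tool_search,
}


def dispatch(ws: Workspace, raw_call: str) -> str:
    """Parse one JSON tool call, validate it, and run the matching handler."""
    try:
        call = json.loads(raw_call)
    except json.JSONDecodeError as exc:
        return f"error: invalid JSON ({exc.msg})"
    if not isinstance(call, dict):
        return "error: tool call must be a JSON object"
    name = call.get("tool")
    if name not in TOOLS:
        return f"error: unknown tool {name!r}"
    try:
        return TOOLS[name](ws, **call.get("arguments", {}))
    except (KeyError, TypeError, ValueError) as exc:
        # Missing file, wrong argument names, or a failed edit.
        return f"error: {exc}"


if __name__ == "__main__":
    ws: Workspace = {}
    print(dispatch(ws, '{"tool": "create", "arguments": {"path": "a.py", "content": "x = 1"}}'))
    print(dispatch(ws, '{"tool": "edit", "arguments": {"path": "a.py", "old": "1", "new": "2"}}'))
    print(dispatch(ws, '{"tool": "view", "arguments": {"path": "a.py"}}'))
    print(dispatch(ws, '{"tool": "delete", "arguments": {}}'))
```

Returning errors as strings rather than raising lets the harness feed the failure back to the model as an observation, so it can correct a malformed call on its next turn.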