OpenSWE-72B: Advanced Software Engineering Agent
OpenSWE-72B, developed by GAIR, is a 72.7-billion-parameter model designed for Software Engineering (SWE) tasks. It is trained with the OpenSWE framework, described as the largest fully transparent environment synthesis framework for SWE agent training, which provides 45,320 executable Docker environments drawn from over 12.8k repositories. The framework open-sources its Dockerfiles, evaluation scripts, and distributed multi-agent synthesis pipeline, so that environments can be reproduced and extended.
Key Capabilities
- Unprecedented Scale and Transparency: Built on 45,320 executable environments, with full infrastructure transparency for reproducibility and community contributions.
- Quality-Centric Data Curation: A filtering pipeline characterizes the difficulty of each environment and removes unsolvable or trivially simple instances, retaining challenging, high-quality tasks. Substantial trajectory sampling and curation yielded approximately 13,000 curated trajectories from 9,000 quality-guaranteed environments.
- State-of-the-Art Performance: Achieves 66.0% Pass@1 on SWE-bench Verified using the SWE-Agent scaffold, the state of the art among SFT-based methods in the Qwen2.5 series. OpenSWE-trained models consistently outperform models trained on alternatives such as SWE-rebench across various scales and scaffolds.
- Broad Generalization: Shows substantial out-of-domain improvements on code (e.g., HumanEval +29), math (e.g., MATH-500 +12.2 for 72B), and science benchmarks, without compromising factual recall.
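
The difficulty filtering described under Quality-Centric Data Curation, and the Pass@1 metric quoted above, can be illustrated with a minimal Python sketch. This is not the OpenSWE pipeline itself: the `EnvResult` structure, the function names, and the solve-rate thresholds are hypothetical, chosen only to show the idea of dropping environments that sampled trajectories never solve or always solve.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EnvResult:
    """Sampled outcomes for one environment (hypothetical structure)."""
    env_id: str
    outcomes: List[bool]  # True if a sampled trajectory passed the tests


def solve_rate(result: EnvResult) -> float:
    """Fraction of sampled trajectories that resolved the task."""
    if not result.outcomes:
        return 0.0  # no samples: treated as unsolved in this sketch
    return sum(result.outcomes) / len(result.outcomes)


def filter_by_difficulty(
    results: List[EnvResult],
    min_rate: float = 0.0,  # assumed bound: at or below counts as unsolvable
    max_rate: float = 1.0,  # assumed bound: at or above counts as trivial
) -> List[EnvResult]:
    """Keep environments whose solve rate lies strictly between the bounds."""
    return [r for r in results if min_rate < solve_rate(r) < max_rate]


def pass_at_1(first_attempts: List[bool]) -> float:
    """Pass@1 as the fraction of tasks resolved on a single attempt."""
    if not first_attempts:
        raise ValueError("at least one task is required")
    return sum(first_attempts) / len(first_attempts)


if __name__ == "__main__":
    sampled = [
        EnvResult("never-solved", [False, False, False, False]),
        EnvResult("always-solved", [True, True, True, True]),
        EnvResult("mixed", [True, False, False, True]),
    ]
    for env in filter_by_difficulty(sampled):
        print(env.env_id, solve_rate(env))
```

With the default bounds, only environments with a mixed record survive the filter. As a point of reference for the metric, a benchmark of 500 tasks with 330 resolved on the first attempt would give a Pass@1 of 66.0%.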
Good for
- Developing and training advanced SWE agents.
- Researching and improving automated software development and bug-fixing systems.
- Tasks requiring high-performance code generation and problem-solving within complex software environments.
- Applications benefiting from robust, reproducible, and scalable SWE environment synthesis.