SERA-32B: An Open-Source Coding Agent
SERA-32B, developed by the Allen Institute for AI (Ai2), is a 32-billion-parameter open-source coding agent built on the Qwen3-32B base model. It is the first model in Ai2's Open Coding Agents series and is designed for automated software engineering tasks. Its key differentiator is its training method, Soft Verified Generation (SVG), which is far cheaper than traditional reinforcement learning or earlier synthetic data methods: data generation and training cost approximately $2,000.
Key Capabilities
- High Performance on SWE-bench: Achieves 49.5% on SWE-bench Verified, a benchmark of real-world software issues. This score is comparable to that of frontier open models such as Devstral-Small-2 (24B) and of larger models such as GLM-4.5-Air (110B).
- Cost-Efficient Training: Uses Soft Verified Generation (SVG), a novel two-rollout pipeline that is 26x cheaper than reinforcement learning and 57x cheaper than previous synthetic data methods at equivalent performance.
- 32K Context Length: Evaluated at a 32K-token context length, which lets it work with extensive problem descriptions and code from complex codebases.
- Apache 2.0 License: Available under an Apache 2.0 license, suitable for research, educational, and commercial use in accordance with Ai2's Responsible Use Guidelines.
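To make the 32K context length concrete, here is a minimal sketch of a pre-flight check that estimates whether an issue description plus source files would fit in the window. It is illustrative only: the 4-characters-per-token ratio is a rough assumption rather than a measured property of this model's tokenizer, and the names `estimate_tokens`, `fits_in_context`, and `reserve_for_output` are hypothetical. An accurate count would require the model's actual tokenizer.

```python
CONTEXT_TOKENS = 32 * 1024  # the 32K context length cited above
CHARS_PER_TOKEN = 4  # assumption: rough average, not measured for this tokenizer


def estimate_tokens(text: str) -> int:
    """Rough token estimate: character count divided by CHARS_PER_TOKEN, rounded up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(parts: list[str], reserve_for_output: int = 4096) -> bool:
    """Return True if the estimated prompt size plus reserved output tokens fits."""
    prompt_tokens = sum(estimate_tokens(part) for part in parts)
    return prompt_tokens + reserve_for_output <= CONTEXT_TOKENS


if __name__ == "__main__":
    issue = "Fix the off-by-one error in the pagination helper."
    source_file = "def paginate(items, page, size):\n    return items[page * size:]\n"
    print(fits_in_context([issue, source_file]))
```

Reserving part of the window for the model's own output is a design choice in this sketch; the right reserve depends on how long the agent's responses are expected to be.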
Good For
- Automated Software Engineering: Ideal for tasks such as bug fixing, feature implementation, and code refactoring.
- Repository Specialization: Can be fine-tuned on private codebases to create highly specialized coding agents.
- Research: A valuable tool for studying coding agents, data generation methods, and agent behavior, particularly in the context of efficient training techniques.
SERA-32B is a research artifact without safety filtering. It may generate insecure or incorrect code, so its output requires human oversight and verification.
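One way to picture the human-oversight requirement is a small gate that never auto-approves model output. The sketch below is an assumption about how such a gate might look, not part of SERA-32B or Ai2's tooling; the function name `triage_generated_code` and the two status strings are hypothetical. It uses Python's standard `ast.parse` to reject code that does not even parse, and routes everything else to a human reviewer.

```python
import ast


def triage_generated_code(generated_code: str) -> str:
    """Classify model-generated Python code before it reaches a reviewer.

    Returns "reject" when the code is not syntactically valid Python,
    otherwise "needs_human_review". Nothing is ever auto-approved:
    parsing cleanly says nothing about correctness or security.
    """
    try:
        ast.parse(generated_code)
    except (SyntaxError, ValueError):
        # ValueError covers inputs such as null bytes on some Python versions.
        return "reject"
    return "needs_human_review"


if __name__ == "__main__":
    print(triage_generated_code("def add(a, b):\n    return a + b\n"))
    print(triage_generated_code("def add(a, b:\n    return a + b\n"))
```

A syntax check is only the cheapest possible filter. In practice it would sit in front of tests, static analysis, and a person reading the diff.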