allenai/SERA-32B-GA

Text Generation | Concurrency Cost: 2 | Model Size: 32B | Quant: FP8 | Context Length: 32K | Published: Jan 27, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights

SERA-32B-GA is a 32 billion parameter open-source coding agent developed by the Allen Institute for AI (Ai2), built on the Qwen 3-32B base model. It is fine-tuned on synthetic agent trajectories and achieves 46.6% on SWE-bench Verified, making it a top-performing open-source model for automated software engineering tasks. The model is optimized for bug fixes, feature implementation, and refactoring within a 32K-token context window.


SERA-32B-GA: A Leading Open-Source Coding Agent

SERA-32B-GA is the second model in Allen Institute for AI's (Ai2) Open Coding Agents series, designed for automated software engineering. Built on the Qwen 3-32B base model and leveraging GLM-4.5-Air as a teacher, this 32 billion parameter model demonstrates strong performance in code generation and modification tasks.

Key Capabilities & Performance

  • High SWE-bench Performance: Achieves 46.6% on SWE-bench Verified, positioning it as one of the strongest open-source coding agents, second only to SERA-32B.
  • 32K Context Length: Supports a substantial context window for handling complex codebases and tasks.
  • Synthetic Trajectory Training: Trained on 16,000 synthetic coding agent trajectories generated via Soft Verified Generation (SVG), a method that removes the need for test infrastructure during data creation.
  • CLI Integration: Easily accessible via the sera CLI for quick deployment and integration with existing workflows.
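In practice, the 32K-token window above means repository context must be selected to fit before prompting the model. A minimal sketch of one way to do this, greedily packing whole files into a fixed budget; the 4-characters-per-token ratio is a rough heuristic, not SERA's actual tokenizer, and this is not Ai2's harness:

```python
# Illustrative sketch: pack relevance-ordered repository files into a
# fixed context budget. Assumes ~4 characters per token as a heuristic.

def pack_files(files, max_tokens=32_000, reserve=4_000, chars_per_token=4):
    """Greedily select whole files until the prompt budget is exhausted.

    files: list of (path, source_text) pairs, ordered by relevance.
    reserve: tokens held back for the instruction and the model's reply.
    """
    budget = (max_tokens - reserve) * chars_per_token
    packed, used = [], 0
    for path, text in files:
        cost = len(path) + len(text)
        if used + cost > budget:
            continue  # skip files that would overflow the window
        packed.append((path, text))
        used += cost
    return packed
```

A real harness would count tokens with the model's own tokenizer and might truncate individual files rather than skip them; the greedy whole-file policy here is just the simplest budget-respecting choice.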

Intended Use Cases

  • Automated Software Engineering: Ideal for tasks such as bug fixing, implementing new features, and code refactoring.
  • Repository Specialization: Can be fine-tuned on private codebases to create highly specialized coding agents.
  • Research: Valuable for studying coding agents, data generation methodologies, and agent behavior.
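For the bug-fixing use case above, a task is typically presented to the agent as an issue description plus the relevant source. A hedged sketch of assembling such a prompt; this template is an illustration only, not SERA's actual agent prompt format:

```python
# Illustrative sketch of a bug-fix prompt for a coding agent.
# The template is an assumption for demonstration, not SERA's real format.

def build_bugfix_prompt(issue: str, path: str, source: str) -> str:
    """Combine an issue report and a source file into a single prompt."""
    return (
        "You are a software engineering agent. Fix the bug described "
        "below and reply with a unified diff.\n\n"
        f"Issue:\n{issue}\n\n"
        f"File: {path}\n"
        f"```python\n{source}\n```\n"
    )
```

Asking for a unified diff keeps the reply compact and machine-applicable, which matters when the issue, the file, and the answer must all share the 32K-token window.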

Limitations

  • Primarily validated on SWE-bench Verified (Python repositories); performance on other languages or benchmarks is not guaranteed.
  • Performance is largely bounded by the capabilities of its teacher model, GLM-4.5-Air.
  • May generate insecure or incorrect code, requiring human review and testing before deployment.