allenai/SERA-14B

Text Generation · Model Size: 14B · Quant: FP8 · Context Length: 32k · Published: Feb 3, 2026 · License: apache-2.0 · Architecture: Transformer

SERA-14B is a 14 billion parameter open-source coding agent developed by Allen Institute for AI (Ai2), part of their Open Coding Agents series. Built on a Qwen 3-14B base model with a 32K token context length, it is specifically fine-tuned for automated software engineering tasks like bug fixes and feature implementation. The model achieves 41.7% on SWE-bench Verified, demonstrating strong performance in resolving real-world software issues.


SERA-14B: A Specialized Coding Agent

SERA-14B, developed by Allen Institute for AI (Ai2), is a 14 billion parameter open-source coding agent designed for automated software engineering. It is the fifth model in Ai2's Open Coding Agents series, built upon a Qwen 3-14B base model and fine-tuned using a GLM-4.6 teacher model.

Key Capabilities & Performance

  • High Performance on SWE-bench: Achieves 41.7% on SWE-bench Verified with a 32K context length, matching or outperforming several larger models.
  • Specialized Training: Trained on 25,000 synthetic coding agent trajectories generated via Soft Verified Generation (SVG), which allows for data generation from any repository without test infrastructure.
  • Automated Software Engineering: Excels at tasks such as bug fixes, feature implementation, and code refactoring.
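For orientation, the sketch below shows one plausible way to hand a bug-fix task to SERA-14B served behind an OpenAI-compatible endpoint (e.g. via vLLM). The system prompt, endpoint URL, and the `build_messages` helper are illustrative assumptions, not the model's documented interface:

```python
# Hypothetical sketch: packaging a SWE-bench-style issue for SERA-14B.
# The prompt wording and helper below are assumptions for illustration.

SYSTEM_PROMPT = (
    "You are a software engineering agent. Given a repository issue, "
    "produce a unified diff that resolves it."
)

def build_messages(issue_title: str, issue_body: str, repo: str) -> list[dict]:
    """Package an issue report into chat messages for the model."""
    user = (
        f"Repository: {repo}\n"
        f"Issue: {issue_title}\n\n"
        f"{issue_body}\n\n"
        "Return a patch in unified diff format."
    )
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user},
    ]

messages = build_messages(
    "TypeError in parse_config",
    "Calling parse_config(None) raises TypeError instead of ValueError.",
    "example/project",
)

# With a server running locally, the request could then look like:
# from openai import OpenAI
# client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
# resp = client.chat.completions.create(model="allenai/SERA-14B", messages=messages)
```

In practice an agent harness would loop, feeding tool results back into the conversation rather than issuing a single request.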

Intended Use Cases

  • Automated Software Engineering: Ideal for automating various development tasks within a codebase.
  • Repository Specialization: Can be fine-tuned on private codebases to create highly specialized coding agents.
  • Research: Useful for studying coding agents, data generation methods, and agent behavior.

Limitations

  • Primarily validated on SWE-bench Verified (Python repositories); performance on other languages is unknown.
  • May attempt to call a nonexistent `submit` tool; the sera-cli harness handles this gracefully.
  • Performance is largely bounded by the GLM-4.6 teacher model's capabilities.
  • Like other LLMs, it may generate insecure or incorrect code, requiring human review and testing.
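The spurious `submit` call noted above can be tolerated at the harness level. The sketch below shows one generic pattern for doing so; the dispatch scheme and sentinel value are assumptions for illustration, and sera-cli's actual handling may differ:

```python
# Illustrative sketch: dispatching model-issued tool calls while treating an
# unregistered `submit` call as "the agent is finished" rather than an error.
# The function shape and "__DONE__" sentinel are assumptions, not sera-cli's API.

def run_tool(name: str, args: dict, tools: dict) -> str:
    """Dispatch a tool call; map an unknown `submit` to a stop signal."""
    if name in tools:
        return tools[name](**args)
    if name == "submit":
        # The model believes it is done; stop the loop instead of erroring.
        return "__DONE__"
    return f"error: unknown tool '{name}'"

# A toy tool registry standing in for a real harness's file/shell tools.
tools = {"read_file": lambda path: f"<contents of {path}>"}

print(run_tool("read_file", {"path": "setup.py"}, tools))  # tool runs normally
print(run_tool("submit", {}, tools))                       # treated as done
```

Returning a structured error string for genuinely unknown tools (the last branch) lets the model self-correct on the next turn instead of crashing the run.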