Overview
GAIR/daVinci-Dev-72B-MT is a 72.7-billion-parameter model from the daVinci-Dev family, specialized for agentic software engineering. It is produced by agentic mid-training on novel agent-native data, which is intended to bridge the gap between static pretraining and dynamic coding environments. This model is the mid-training (MT) checkpoint, taken before Supervised Fine-Tuning (SFT).
Key Training & Data
- Agent-native mid-training: uses two types of trajectories:
  - Contextually-native (PR-derived): 68.6 billion tokens from GitHub pull requests, preserving the full information flow, including context retrieval and sequential edits.
  - Environmentally-native (executable rollouts): 3.1 billion raw tokens (4.5 billion effective) collected from real executable repositories with genuine tool and test outputs, capturing authentic feedback loops.
- Starts from the Qwen2.5 base model family.
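The token counts quoted above can be combined with a little arithmetic. The sketch below is only illustrative: it assumes "effective" tokens are the raw rollout tokens after some upweighting or repetition, which the summary does not spell out, and the variable names are made up for this example.

```python
# Token counts quoted in the list above, in billions of tokens.
PR_TOKENS_B = 68.6          # contextually-native (PR-derived)
ROLLOUT_RAW_B = 3.1         # environmentally-native, raw
ROLLOUT_EFFECTIVE_B = 4.5   # environmentally-native, effective

# Totals over both trajectory types.
raw_total_b = round(PR_TOKENS_B + ROLLOUT_RAW_B, 1)
effective_total_b = round(PR_TOKENS_B + ROLLOUT_EFFECTIVE_B, 1)

# Assumption: the effective/raw ratio reflects how strongly rollouts are upweighted.
rollout_weight = ROLLOUT_EFFECTIVE_B / ROLLOUT_RAW_B

# Share of the effective mixture that comes from executable rollouts.
rollout_share = ROLLOUT_EFFECTIVE_B / effective_total_b

print(f"raw total: {raw_total_b}B, effective total: {effective_total_b}B")
print(f"rollout weight: {rollout_weight:.2f}x, rollout share: {rollout_share:.1%}")
```

Under these assumptions the PR-derived data dominates the mixture by volume, while the executable rollouts form a small share even after counting their effective tokens.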
Key Results
- daVinci-Dev-72B (the SFT version) achieves 58.5% Pass@1 on SWE-Bench Verified, state-of-the-art performance among open training recipes using agentic scaffolds. Note that this figure is for the SFT version, not for this mid-training checkpoint. The recipe is also reported to generalize well, including to standard code benchmarks such as HumanEval/EvalPlus and to scientific reasoning benchmarks.
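For readers unfamiliar with the metric, Pass@1 is the fraction of tasks resolved when the agent gets a single attempt per task. A minimal sketch follows; the helper name `pass_at_1` and the outcome list are hypothetical and are not taken from any daVinci-Dev evaluation.

```python
from typing import Sequence


def pass_at_1(resolved: Sequence[bool]) -> float:
    """Return the fraction of tasks resolved, given one attempt per task."""
    if not resolved:
        raise ValueError("need at least one task outcome")
    # True counts as 1 and False as 0, so the sum is the number of resolved tasks.
    return sum(resolved) / len(resolved)


# Hypothetical outcomes for four tasks: three resolved, one not.
example_outcomes = [True, False, True, True]
print(f"Pass@1 = {pass_at_1(example_outcomes):.1%}")
```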
Good For
- Developers building software engineering agents that require robust interaction with dynamic coding environments.
- Tasks involving code generation, debugging, and automated software development within a feedback loop.
- Integration with frameworks like SWE-Agent for complex software tasks.
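As a starting point for experimentation, the checkpoint can probably be loaded like other Qwen2.5-derived models. The sketch below assumes the weights are published on the Hugging Face Hub under the model ID above in a `transformers`-compatible format, that `transformers`, `torch`, and `accelerate` are installed, and that enough GPU memory is available for a 72.7B-parameter model. The function names `load_model` and `complete` are illustrative. Because this is a pre-SFT checkpoint, the sketch uses plain text completion rather than a chat template.

```python
MODEL_ID = "GAIR/daVinci-Dev-72B-MT"  # model ID as given in the overview


def load_model(model_id: str = MODEL_ID):
    """Load tokenizer and model (assumes a transformers-compatible checkpoint)."""
    # Imported lazily so that defining these helpers does not require the libraries.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype="auto",   # use the dtype stored in the checkpoint config
        device_map="auto",    # shard across available devices (needs accelerate)
    )
    return tokenizer, model


def complete(tokenizer, model, prompt: str, max_new_tokens: int = 128) -> str:
    """Continue a plain-text prompt and return only the newly generated text."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so only the continuation is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

A typical call would be `tokenizer, model = load_model()` followed by `complete(tokenizer, model, "def fibonacci(n):")`. For agentic use, a scaffold such as SWE-Agent would normally sit on top of a served endpoint rather than call the model in-process.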