Overview of daVinci-Dev-72B
daVinci-Dev-72B is a 72-billion-parameter model in the daVinci-Dev family, developed by GAIR with a focus on agentic software engineering. It is built on the Qwen2.5-Base architecture and trained with a methodology called agentic mid-training, which incorporates agent-native data to bridge the gap between traditional pretraining corpora and the dynamic, feedback-rich environments that real code agents operate in.
Key Training Methodology
The model's training involves two primary types of trajectories:
- Contextually-native trajectories (PR-derived): These are constructed from GitHub pull requests, preserving the full information flow from file discovery and context retrieval to sequential edits. This provides broad coverage and diversity in coding scenarios.
- Environmentally-native trajectories (executable rollouts): Collected from real executable repositories, these trajectories capture authentic feedback loops from genuine tool and test outputs, including both passing and non-passing runs.
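As a rough illustration (the record layout below is hypothetical, not the published daVinci-Dev data format), an environmentally-native trajectory can be pictured as an ordered list of agent actions paired with the observations the environment actually returned:

```python
# Hypothetical sketch of a single environmentally-native trajectory record.
# All field names and values here are illustrative assumptions.
trajectory = {
    "repo": "example-org/example-repo",  # assumed repository identifier
    "steps": [
        # Each step pairs an agent action with the real feedback it produced.
        {"action": "open_file utils/parser.py", "observation": "<file contents>"},
        {"action": "edit utils/parser.py", "observation": "edit applied"},
        {"action": "run_tests", "observation": "2 passed, 1 failed"},  # genuine test output
        {"action": "edit utils/parser.py", "observation": "edit applied"},
        {"action": "run_tests", "observation": "3 passed"},  # passing terminal state
    ],
}

# Both passing and non-passing rollouts are retained, so the model also sees
# recovery from failure rather than only clean solutions.
final_observation = trajectory["steps"][-1]["observation"]
print(final_observation)
```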
Performance and Capabilities
daVinci-Dev-72B demonstrates strong performance on software engineering tasks, achieving 58.5% Pass@1 on SWE-Bench Verified. This places it among the state of the art for open training recipes at its model scale, despite starting from a non-coder base model. The model also shows generalization gains on standard code benchmarks like HumanEval/EvalPlus and scientific reasoning benchmarks such as GPQA/SciBench.
Intended Use
This model is designed for use within agentic scaffolds like SWE-Agent for automated software development and bug fixing. It is also compatible with standard inference frameworks like Hugging Face Transformers and vLLM.
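A minimal Transformers sketch is shown below. The hub ID `GAIR/daVinci-Dev-72B` and the system prompt are assumptions; check the official model page for the exact repository name and recommended generation settings.

```python
# Sketch of running daVinci-Dev-72B with Hugging Face Transformers.
# The model ID below is an assumption, not a confirmed hub name.

def build_messages(task: str) -> list[dict]:
    """Wrap a software-engineering task in a chat-format message list."""
    return [
        {"role": "system", "content": "You are a software engineering agent."},
        {"role": "user", "content": task},
    ]

def generate_response(task: str, model_id: str = "GAIR/daVinci-Dev-72B") -> str:
    # Deferred imports: loading a 72B checkpoint needs substantial GPU memory,
    # so this function is a sketch rather than something to run casually.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype="auto", device_map="auto"
    )
    inputs = tokenizer.apply_chat_template(
        build_messages(task), add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(inputs, max_new_tokens=512)
    # Decode only the newly generated tokens.
    return tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)

print(build_messages("Fix the failing date parsing test.")[1]["role"])
```

With vLLM, the same chat-format messages can be served through its OpenAI-compatible server; only the generation call changes between frameworks.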