GAIR/daVinci-Dev-72B

Text generation · Model size: 72.7B · Quantization: FP8 · Context length: 32k · Published: Jan 23, 2026 · License: Qwen · Architecture: Transformer · Concurrency cost: 4

GAIR/daVinci-Dev-72B is a 72 billion parameter large language model developed by GAIR, specifically trained for agentic software engineering tasks. It utilizes agent-native mid-training and fine-tuning on contextually-native and environmentally-native trajectories to reduce the distribution mismatch between static pretraining corpora and dynamic coding environments. This model excels at complex software engineering challenges, achieving state-of-the-art performance on benchmarks like SWE-Bench Verified.


Overview of daVinci-Dev-72B

daVinci-Dev-72B is a 72-billion-parameter model from the daVinci-Dev family, developed by GAIR and focused on agentic software engineering. It is built on the Qwen2.5-Base architecture and trained with a methodology called agentic mid-training, which incorporates agent-native data to bridge the gap between traditional pretraining data and the dynamic, feedback-rich environments encountered by real code agents.

Key Training Methodology

The model's training involves two primary types of trajectories:

  • Contextually-native trajectories (PR-derived): These are constructed from GitHub pull requests, preserving the full information flow from file discovery and context retrieval to sequential edits. This provides broad coverage and diversity in coding scenarios.
  • Environmentally-native trajectories (executable rollouts): Collected from real executable repositories, these trajectories capture authentic feedback loops from genuine tool and test outputs, including both passing and non-passing scenarios.
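To make the two trajectory types concrete, here is a minimal sketch of what one step of an environmentally-native trajectory might contain. The field and class names are illustrative assumptions for exposition only; GAIR's actual training data format is not described on this card.

```python
from dataclasses import dataclass, field

# Hypothetical schema for an environmentally-native trajectory.
# Field names are illustrative assumptions, not GAIR's actual format.
@dataclass
class TrajectoryStep:
    tool: str           # e.g. "edit_file" or "run_tests"
    arguments: dict     # tool inputs, e.g. {"path": "src/utils.py"}
    observation: str    # raw environment feedback (test logs, tracebacks)
    tests_passed: bool  # authentic pass/fail signal from the repo's tests

@dataclass
class Trajectory:
    task: str                           # issue or PR description
    steps: list = field(default_factory=list)

    def succeeded(self) -> bool:
        """A rollout counts as passing if its final step's tests pass."""
        return bool(self.steps) and self.steps[-1].tests_passed
```

Note that, per the card, both passing and non-passing rollouts are kept, so the training data captures genuine failure feedback rather than only successful edits.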

Performance and Capabilities

daVinci-Dev-72B demonstrates strong performance in software engineering tasks, achieving 58.5% Pass@1 on SWE-Bench Verified. This places it among the state-of-the-art for open training recipes within its model size, despite starting from a non-coder base model. The model also shows generalization gains on standard code benchmarks like HumanEval/EvalPlus and scientific reasoning benchmarks such as GPQA/SciBench.

Intended Use

This model is designed for use within agentic scaffolds like SWE-Agent for automated software development and bug fixing. It is also compatible with standard inference frameworks like Hugging Face Transformers and vLLM.
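As a starting point, the model can be loaded with the standard Hugging Face Transformers API. This is a generic usage sketch, not an official recipe: the system prompt, task string, and generation settings below are illustrative assumptions, and a 72B FP8 model requires substantial GPU memory (or a framework like vLLM) to run in practice.

```python
MODEL_ID = "GAIR/daVinci-Dev-72B"

def build_messages(task: str) -> list:
    """Wrap a software-engineering task in a chat message list.
    The system prompt here is an illustrative assumption."""
    return [
        {"role": "system", "content": "You are a software engineering agent."},
        {"role": "user", "content": task},
    ]

def main() -> None:
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    prompt = tokenizer.apply_chat_template(
        build_messages("Fix the failing test in utils.py"),
        tokenize=False,
        add_generation_prompt=True,
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=512)
    # Decode only the newly generated tokens, not the prompt.
    print(tokenizer.decode(
        output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
    ))

if __name__ == "__main__":
    main()
```

For higher-throughput agentic use, the same model id can be served with vLLM and called through its OpenAI-compatible API instead.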