Overview
OpenThinker-Agent-v1 Overview
OpenThinker-Agent-v1, developed by OpenThoughts, is an 8 billion parameter model derived from Qwen3-8B, specifically engineered for agentic tasks. This model undergoes a two-stage training process: initial supervised fine-tuning (SFT) on the OpenThoughts-Agent-v1-SFT dataset, followed by reinforcement learning (RL) using the OpenThoughts-Agent-v1-RL dataset. The SFT dataset comprises approximately 15,200 traces from sources like nl2bash and InferredBugs, while the RL dataset contains around 720 tasks from nl2bash verified.
Key Capabilities and Performance
- Agentic Task Specialization: Optimized for complex agentic environments such as Terminal-Bench 2.0 and SWE-Bench.
- Enhanced Performance: Demonstrates significant improvements over its base model, Qwen3-8B, on agent benchmarks. For instance, it achieves 4.9 on Terminal-Bench 2.0 and 15.7 on SWE-Bench Verified, compared to Qwen3-8B's 0.0 and 0.7 respectively.
- Robust Training Data: Utilizes meticulously curated datasets, including a three-stage filtration pipeline to ensure data quality and stability for training.
Ideal Use Cases
- Automated Problem Solving: Suitable for applications requiring autonomous agents to solve problems in terminal or software development environments.
- Code Generation and Debugging: Excels in tasks related to code understanding, generation, and debugging, as indicated by its performance on SWE-Bench.
- Research and Development: A valuable resource for researchers and developers exploring advanced agentic AI systems and dataset curation techniques.