open-thoughts/OpenThinker-Agent-v1

8B parameters · FP8 · 32,768-token context · License: apache-2.0

OpenThinker-Agent-v1: Agentic Task Specialist

OpenThinker-Agent-v1 is an 8-billion-parameter model from open-thoughts, built on the Qwen3-8B architecture and engineered for agentic tasks. It is trained in two stages: supervised fine-tuning (SFT) followed by reinforcement learning (RL).
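A minimal loading sketch with the Hugging Face `transformers` library. The repo id matches this model card; the dtype and device settings are illustrative assumptions, not documented defaults:

```python
# Hypothetical loading sketch for OpenThinker-Agent-v1.
# Generation and placement settings below are illustrative assumptions.

MODEL_ID = "open-thoughts/OpenThinker-Agent-v1"

def load_agent(model_id: str = MODEL_ID):
    """Load tokenizer and model lazily so the heavyweight import is paid on first use."""
    from transformers import AutoModelForCausalLM, AutoTokenizer
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # device_map="auto" assumes `accelerate` is installed for automatic placement.
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype="auto", device_map="auto"
    )
    return tokenizer, model
```

From there, the usual chat flow applies: format a conversation with `tokenizer.apply_chat_template(...)` and pass the result to `model.generate(...)`.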

Key Capabilities & Performance

At its scale, the model reports state-of-the-art results on several agent benchmarks:

  • Terminal-Bench 2.0: Achieves a score of 4.9, significantly outperforming the base Qwen3-8B (0.0) and even Qwen3-32B (1.9).
  • SWE-Bench Verified: Scores 15.7, a substantial improvement over Qwen3-8B (0.7) and Qwen3-32B (5.7).
  • OpenThoughts-TB-Dev: Reaches 17.3, surpassing Qwen3-8B (5.7) and Qwen3-32B (10.2).

Training Methodology

The model's performance stems from its two-stage training pipeline:

  • Supervised Fine-Tuning (SFT): Utilizes the OpenThoughts-Agent-v1-SFT dataset, comprising approximately 15,200 traces from nl2bash (shell command formatting) and InferredBugs (C# and Java bug-fixing tasks).
  • Reinforcement Learning (RL): Further refined using the OpenThoughts-Agent-v1-RL dataset, which includes around 720 tasks from the nl2bash verified dataset.

Data Filtration

To ensure training stability and quality, a three-stage filtration pipeline removes tasks with flaky or slow verifiers, unstable environments, or difficulty excessive even for strong models such as GPT-5 Codex.
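The three stages can be sketched as successive filters over a pool of tasks. The `Task` fields and the pass-rate threshold below are hypothetical; the model card only summarizes the criteria, not the exact implementation:

```python
# Illustrative sketch of the three-stage task filtration described above.
# Field names and thresholds are assumptions, not the project's actual pipeline.
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    verifier_flaky: bool           # stage 1: verifier is flaky or slow
    env_stable: bool               # stage 2: environment stability
    strong_model_pass_rate: float  # stage 3: e.g. a GPT-5 Codex success rate

def filter_tasks(tasks, min_pass_rate=0.1):
    """Apply the three filters in order and return the surviving tasks."""
    kept = [t for t in tasks if not t.verifier_flaky]   # drop flaky/slow verifiers
    kept = [t for t in kept if t.env_stable]            # drop unstable environments
    # Drop tasks too hard even for a strong reference model.
    kept = [t for t in kept if t.strong_model_pass_rate >= min_pass_rate]
    return kept
```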

Ideal Use Cases

This model is particularly well-suited for applications requiring autonomous agents capable of interacting with terminal environments, resolving software bugs, and executing complex multi-step instructions.
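The terminal-interaction use case can be sketched as a simple propose-execute loop, where the model emits a shell command and sees its output. Everything here is an illustrative assumption (`propose_command` stands in for a model call); it is not the project's actual agent harness:

```python
# Minimal sketch of a terminal-agent loop of the kind this model targets.
import subprocess

def run_agent(propose_command, goal, max_steps=5):
    """Alternate between asking the model for a shell command and executing it."""
    transcript = []
    for _ in range(max_steps):
        cmd = propose_command(goal, transcript)  # model decides the next command
        if cmd is None:  # model signals that the goal is done
            break
        result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
        transcript.append((cmd, result.stdout + result.stderr))  # feed output back
    return transcript
```

In a real harness, `propose_command` would format the goal and transcript into a chat prompt, call the model, and parse a command (or a stop signal) out of its reply.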