open-thoughts/OpenThinker-Agent-v1

Warm
Public
8B
FP8
32768
Dec 5, 2025
License: apache-2.0
Hugging Face
Overview

OpenThinker-Agent-v1 Overview

OpenThinker-Agent-v1, developed by OpenThoughts, is an 8 billion parameter model derived from Qwen3-8B, specifically engineered for agentic tasks. This model undergoes a two-stage training process: initial supervised fine-tuning (SFT) on the OpenThoughts-Agent-v1-SFT dataset, followed by reinforcement learning (RL) using the OpenThoughts-Agent-v1-RL dataset. The SFT dataset comprises approximately 15,200 traces from sources like nl2bash and InferredBugs, while the RL dataset contains around 720 tasks from nl2bash verified.

Key Capabilities and Performance

  • Agentic Task Specialization: Optimized for complex agentic environments such as Terminal-Bench 2.0 and SWE-Bench.
  • Enhanced Performance: Demonstrates significant improvements over its base model, Qwen3-8B, on agent benchmarks. For instance, it achieves 4.9 on Terminal-Bench 2.0 and 15.7 on SWE-Bench Verified, compared to Qwen3-8B's 0.0 and 0.7 respectively.
  • Robust Training Data: Utilizes meticulously curated datasets, including a three-stage filtration pipeline to ensure data quality and stability for training.

Ideal Use Cases

  • Automated Problem Solving: Suitable for applications requiring autonomous agents to solve problems in terminal or software development environments.
  • Code Generation and Debugging: Excels in tasks related to code understanding, generation, and debugging, as indicated by its performance on SWE-Bench.
  • Research and Development: A valuable resource for researchers and developers exploring advanced agentic AI systems and dataset curation techniques.