open-thoughts/OpenThinker-Agent-v1-SFT

License: apache-2.0

OpenThinker-Agent-v1-SFT Overview

OpenThinker-Agent-v1-SFT is an 8-billion-parameter model from OpenThoughts, serving as the supervised fine-tuning (SFT) stage of the OpenThinker-Agent-v1 series. It is built on the Qwen3-8B architecture with a 32,768-token context length. The model is trained specifically for agentic tasks and is evaluated on benchmarks such as Terminal-Bench 2.0 and SWE-Bench.
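The agentic setting above can be sketched as an observe-act loop: the model proposes a shell command, a harness executes it, and the output is fed back as the next observation. The sketch below stubs out the model call (a real harness would query OpenThinker-Agent-v1-SFT via an inference server; the function names and message format are assumptions, not part of this card):

```python
import subprocess

def propose_command(history):
    """Stand-in for querying OpenThinker-Agent-v1-SFT; a real harness
    would send `history` as chat messages and parse the reply.
    A fixed command is returned here so the loop is runnable."""
    return "echo hello-agent"

def run_step(history):
    """One observe-act cycle: ask the model for a shell command,
    execute it, and append both sides to the conversation history."""
    cmd = propose_command(history)
    result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    history.append({"role": "assistant", "content": cmd})
    history.append({"role": "user", "content": result.stdout.strip()})
    return history

history = [{"role": "user", "content": "Print a greeting in the terminal."}]
history = run_step(history)
print(history[-1]["content"])  # → hello-agent
```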

Key Capabilities

  • Agentic Task Performance: Optimized for tasks requiring autonomous problem-solving and execution, particularly in terminal and software engineering contexts.
  • Supervised Fine-Tuning: This version is the result of the SFT stage, trained on the OpenThoughts-Agent-v1-SFT dataset, which includes approximately 15,200 traces drawn from the nl2bash and InferredBugs datasets.
  • Foundation for RL: It serves as the base model before further reinforcement learning (RL) optimization, with the fully RL-trained model available as OpenThinker-Agent-v1.
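For intuition, an SFT trace of the kind described above might look like a chat-style record pairing an instruction with the target action. The exact OpenThoughts-Agent-v1-SFT schema is not documented here, so the shape below is an illustrative assumption for an nl2bash-style example:

```python
def make_trace(instruction, command):
    """Hypothetical single-turn trace layout (the real dataset schema
    may differ): one user instruction, one assistant shell command."""
    return {
        "messages": [
            {"role": "user", "content": instruction},
            {"role": "assistant", "content": command},
        ]
    }

trace = make_trace(
    "List all .log files modified in the last day.",
    "find . -name '*.log' -mtime -1",
)
print(len(trace["messages"]))  # → 2
```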

Good for

  • Developers exploring agentic LLM capabilities.
  • Research into supervised fine-tuning techniques for agent models.
  • Applications requiring models proficient in shell command generation and bug identification/fixing.
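For shell-command-generation applications like the last item above, a harness typically needs to extract the command from a free-form model reply. The fenced-block convention below is an assumption about output formatting, not something this card specifies:

```python
import re

def extract_bash(reply):
    """Pull the first ```bash/```sh fenced block from a model reply,
    falling back to the raw reply if no fence is found. This parsing
    convention is an assumption, not part of the model card."""
    m = re.search(r"```(?:bash|sh)\n(.*?)```", reply, re.DOTALL)
    return m.group(1).strip() if m else reply.strip()

reply = "Here you go:\n```bash\ngrep -rn TODO src/\n```"
print(extract_bash(reply))  # → grep -rn TODO src/
```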