OpenThinker-Agent-v1-SFT Overview
OpenThinker-Agent-v1-SFT is an 8 billion parameter model from OpenThoughts, serving as the supervised fine-tuning (SFT) stage of the OpenThinker-Agent-v1 series. It is built upon the Qwen3-8B architecture and features a 32,768 token context length. This model is specifically trained for agentic tasks, demonstrating capabilities in environments such as Terminal-Bench 2.0 and SWE-Bench.
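Because the model follows the Qwen3-8B architecture, it can typically be loaded through the standard Hugging Face transformers chat interface. The sketch below is a minimal example, assuming the model is published under the repository id open-thoughts/OpenThinker-Agent-v1-SFT; adjust the id and prompt to match the actual release and your use case.

```python
# Minimal sketch: load the model and ask it for a shell command,
# the kind of task its SFT data (nl2bash) targets.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "open-thoughts/OpenThinker-Agent-v1-SFT"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [
    {"role": "user",
     "content": "Write a bash command that finds all .log files larger than 10 MB under /var/log."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```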
Key Capabilities
- Agentic Task Performance: Optimized for tasks requiring autonomous problem-solving and execution, particularly in terminal and software engineering contexts.
- Supervised Fine-Tuning: This version is the result of the SFT stage, trained on the OpenThoughts-Agent-v1-SFT dataset, which includes approximately 15,200 traces from the nl2bash and InferredBugs datasets.
- Foundation for RL: It serves as the base model before further reinforcement learning (RL) optimization, with the fully RL-trained model available as OpenThinker-Agent-v1.
Good for
- Developers exploring agentic LLM capabilities.
- Research into supervised fine-tuning techniques for agent models.
- Applications requiring models proficient in shell command generation and bug identification/fixing.