open-thoughts/OpenThinker3-7B

License: apache-2.0
Overview

OpenThinker3-7B: A Specialized Reasoning Model

OpenThinker3-7B is a 7.6 billion parameter language model developed by the open-thoughts team, fine-tuned from Qwen2.5-7B-Instruct. It was trained on the OpenThoughts3-1.2M dataset, which comprises 850,000 math, 250,000 code, and 100,000 science questions, and it improves substantially on its predecessors, OpenThinker-7B and OpenThinker2-7B.
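The checkpoint loads with the standard Hugging Face transformers API. Below is a minimal sketch, assuming a recent transformers release and the chat template bundled with the Qwen2.5-based checkpoint; the prompt and generation settings are illustrative, not a recommended configuration from the authors.

```python
# Minimal sketch: load OpenThinker3-7B and run one chat-formatted generation.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "open-thoughts/OpenThinker3-7B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the checkpoint's native precision
    device_map="auto",    # place weights on available GPUs/CPU
)

# Qwen2.5-based checkpoints ship a chat template, so format the prompt
# through it rather than passing raw text.
messages = [{"role": "user", "content": "What is 27 * 43?"}]  # illustrative prompt
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)  # assumed budget, not prescribed
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Reasoning models of this kind often emit long chains of thought before the final answer, so generous `max_new_tokens` budgets are generally advisable.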

Key Capabilities

  • Enhanced Reasoning: Strong performance on complex reasoning benchmarks, particularly in mathematics, coding, and scientific problem solving.
  • Competitive Performance: Outperforms other strong 7B-scale reasoning models, including DeepSeek-R1-Distill-Qwen-7B and Llama-3.1-Nemotron-Nano-8B-v1, across a range of evaluations.
  • Extensive Context Window: Supports a context length of 131,072 tokens, enabling lengthy and intricate problem descriptions (see the length-check sketch after this list).
  • Data-Driven Improvement: The training dataset was built through a comprehensive data pipeline and over 1,000 ablation experiments.
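Long inputs can silently overflow the window, so a quick token-count check before generation is a cheap safeguard. A minimal sketch using the standard tokenizer API; the input file name is hypothetical.

```python
# Hypothetical sketch: verify a long prompt fits the 131,072-token window.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("open-thoughts/OpenThinker3-7B")
MAX_CONTEXT = 131_072  # context length reported for the model

with open("long_problem_statement.txt") as f:  # hypothetical input file
    text = f.read()

n_tokens = len(tokenizer.encode(text))
if n_tokens > MAX_CONTEXT:
    raise ValueError(
        f"Prompt is {n_tokens} tokens; exceeds the {MAX_CONTEXT}-token window."
    )
print(f"{n_tokens} tokens; leaves {MAX_CONTEXT - n_tokens} tokens for generation.")
```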

Good For

  • Mathematical Problem Solving: Excels on AIME, AMC, MATH500, and JEEBench evaluations (a prompting sketch follows this list).
  • Code-Related Reasoning: Strong results on the CodeElo and CodeForces benchmarks.
  • Scientific Inquiry: Performs well on GPQA-D and other science-related reasoning tasks.
  • Research and Development: Ideal for applications requiring robust analytical and logical deduction capabilities, especially in technical domains. For more details, refer to the OpenThoughts paper.
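As one illustration of the math use case, the pipeline API can be pointed at the checkpoint directly. This sketch assumes a recent transformers version that accepts chat-style message lists in the text-generation pipeline; the problem statement is made up for demonstration.

```python
# Sketch: ask the model a competition-style math question via the pipeline API.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="open-thoughts/OpenThinker3-7B",
    torch_dtype="auto",
    device_map="auto",
)

messages = [{
    "role": "user",
    "content": (  # illustrative problem, not from any benchmark
        "Find the number of positive integers n <= 100 "
        "such that n^2 + n is divisible by 6."
    ),
}]

result = generator(messages, max_new_tokens=2048)
# The pipeline returns the full conversation; the assistant reply is last.
print(result[0]["generated_text"][-1]["content"])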