open-thoughts/OpenThinker3-7B

Text generation · Concurrency cost: 1 · Model size: 7.6B · Quantization: FP8 · Context length: 32k · Published: May 27, 2025 · License: apache-2.0 · Architecture: Transformer · Open weights

OpenThinker3-7B by open-thoughts is a 7.6 billion parameter reasoning model, fine-tuned from Qwen2.5-7B-Instruct on the OpenThoughts3-1.2M dataset. It is optimized for complex reasoning tasks across mathematics, code, and science, and performs strongly relative to other 7B models. Its context length of 131,072 tokens makes it suitable for detailed problem-solving and analytical applications.


OpenThinker3-7B: A Specialized Reasoning Model

OpenThinker3-7B is a 7.6 billion parameter language model developed by open-thoughts and fine-tuned from Qwen2.5-7B-Instruct. It is designed for advanced reasoning tasks and was trained on the OpenThoughts3-1.2M dataset, which comprises 850,000 math, 250,000 code, and 100,000 science questions. The model represents a significant advancement over its predecessors, OpenThinker-7B and OpenThinker2-7B.

Key Capabilities

  • Enhanced Reasoning: Demonstrates superior performance in complex reasoning benchmarks, particularly in mathematics, coding, and scientific problem-solving.
  • Competitive Performance: Outperforms several other strong 7B reasoning models, including DeepSeek-R1-Distill-Qwen-7B and Llama-3.1-Nemotron-Nano-8B-v1, across various evaluation metrics.
  • Extensive Context Window: Features a large context length of 131,072 tokens, enabling the processing of lengthy and intricate problem descriptions.
  • Data-Driven Improvement: Achieves its strong performance through a comprehensive data pipeline and over 1,000 ablation experiments, leading to the creation of its specialized training dataset.

Good For

  • Mathematical Problem Solving: Excels in AIME, AMC, MATH500, and JEEBench evaluations.
  • Code-Related Reasoning: Shows strong results in CodeElo and CodeForces benchmarks.
  • Scientific Inquiry: Performs well in GPQA-D and other science-related reasoning tasks.
  • Research and Development: Ideal for applications requiring robust analytical and logical deduction capabilities, especially in technical domains. For more details, refer to the OpenThoughts paper.
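For these use cases the model can be queried through any OpenAI-compatible chat-completions endpoint. The sketch below only assembles the request payload; the base URL, the sampler values (`temperature=0.6`, `top_p=0.95`), and the `max_tokens` budget are illustrative assumptions, not settings published for this model.

```python
import json

# Hypothetical endpoint; substitute your provider's OpenAI-compatible URL.
BASE_URL = "https://api.featherless.ai/v1/chat/completions"
MODEL_ID = "open-thoughts/OpenThinker3-7B"

def build_chat_request(question: str, temperature: float = 0.6,
                       top_p: float = 0.95, max_tokens: int = 4096) -> dict:
    """Assemble a chat-completion payload for a single reasoning query.

    Reasoning models emit long chains of thought, so max_tokens is set
    generously here; tune it for your workload.
    """
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": question}],
        "temperature": temperature,
        "top_p": top_p,
        "max_tokens": max_tokens,
    }

payload = build_chat_request("Find all real x with x^2 - 5x + 6 = 0.")
print(json.dumps(payload, indent=2))
```

The payload would then be POSTed to `BASE_URL` with an API key in the `Authorization` header, as with any OpenAI-compatible service.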

Popular Sampler Settings

Featherless users most commonly tune the following sampler parameters for this model: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, and min_p.
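To make these parameters concrete, the sketch below shows how top_k, top_p (nucleus), and min_p filtering interact on a token probability distribution. The min_p threshold is interpreted relative to the most likely token, which is one common convention (e.g. in vLLM and llama.cpp); the penalty parameters act on logits based on previously generated tokens and are not shown. All values are illustrative.

```python
def filter_probs(probs, top_k=0, top_p=1.0, min_p=0.0):
    """Zero out tokens excluded by top-k, nucleus (top-p), and min-p
    filtering, then renormalize the survivors.

    Pure-Python sketch of how these sampler parameters interact;
    top_k=0 means "no top-k limit".
    """
    indexed = sorted(enumerate(probs), key=lambda kv: kv[1], reverse=True)
    max_prob = indexed[0][1]
    keep, cumulative = set(), 0.0
    for rank, (idx, p) in enumerate(indexed):
        if top_k and rank >= top_k:        # top-k: keep only the k best
            break
        if p < min_p * max_prob:           # min-p scales with the top token
            break
        keep.add(idx)
        cumulative += p
        if cumulative >= top_p:            # nucleus cutoff reached
            break
    filtered = [p if i in keep else 0.0 for i, p in enumerate(probs)]
    total = sum(filtered)
    return [p / total for p in filtered]

# With top_k=2, only the two most likely tokens survive and are renormalized.
print(filter_probs([0.5, 0.3, 0.15, 0.05], top_k=2))
```

Lower temperature and tighter top_p/top_k make sampling more deterministic, which is often preferred for math and code reasoning; min_p instead adapts the cutoff to how peaked the distribution is.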