open-thoughts/OpenThinker-32B

Hugging Face
Text generation · Concurrency cost: 2 · Model size: 32.8B · Quant: FP8 · Context length: 32k · Published: Feb 12, 2025 · License: apache-2.0 · Architecture: Transformer · Open weights

OpenThinker-32B is a 32.8-billion-parameter instruction-tuned causal language model developed by open-thoughts and fine-tuned from Qwen2.5-32B-Instruct. It is optimized for reasoning tasks, with strong results on benchmarks such as MATH500 and GPQA Diamond. The model targets complex problem-solving and knowledge-intensive applications, leveraging its 131,072-token context length.


OpenThinker-32B: A Reasoning-Optimized LLM

OpenThinker-32B is a 32.8-billion-parameter language model developed by open-thoughts, built upon the Qwen2.5-32B-Instruct architecture. Its primary distinction lies in its fine-tuning on the openly released OpenThoughts-114k dataset, which was produced by distilling DeepSeek-R1. This specialized training focuses on enhancing the model's reasoning capabilities, making it particularly adept at complex analytical tasks.
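Since the weights are openly available, the model can be queried through the standard `transformers` generation API. A minimal sketch, not the project's own inference code; `max_new_tokens` and the single-turn message format are illustrative choices:

```python
MODEL_ID = "open-thoughts/OpenThinker-32B"

def build_messages(question: str) -> list:
    """Wrap a user question in the chat-message format consumed by
    tokenizer.apply_chat_template()."""
    return [{"role": "user", "content": question}]

def generate_answer(question: str, max_new_tokens: int = 512) -> str:
    """Load the full model and generate an answer. Loading ~32.8B parameters
    needs a multi-GPU node (or a quantized variant); the import is deferred
    so the prompt helper above works without transformers installed."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    prompt = tokenizer.apply_chat_template(
        build_messages(question), tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
```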

Key Capabilities & Performance

  • Enhanced Reasoning: OpenThinker-32B shows strong performance in reasoning benchmarks, achieving 90.6 on MATH500 and 61.6 on GPQA Diamond, outperforming several other 32B models in these specific metrics.
  • Extensive Context: The model supports a substantial context length of 131,072 tokens, enabling it to process and understand large amounts of information for intricate problem-solving.
  • Open-Source Ecosystem: The project emphasizes transparency, providing open access to its model weights, datasets (OpenThoughts-114k), data generation code, evaluation code (Evalchemy), and training code (LlamaFactory).

Training Details

The model was fine-tuned for 3 epochs with a 16k context length on the OpenThoughts-114k dataset. Training required significant compute: 8xH100 P5 nodes on AWS SageMaker for approximately 90 hours.
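The project reports that training used LlamaFactory, but the exact configuration is not reproduced here. A hypothetical LlamaFactory-style SFT config consistent with the stated setup might look like the following; only the base model, dataset, 16k cutoff, and 3 epochs come from this page, and all other field names and values are illustrative assumptions:

```yaml
# Illustrative sketch only -- not the released training config.
model_name_or_path: Qwen/Qwen2.5-32B-Instruct
stage: sft
do_train: true
finetuning_type: full
dataset: open-thoughts/OpenThoughts-114k
cutoff_len: 16384        # 16k training context, per the report
num_train_epochs: 3      # per the report
bf16: true
```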

Ideal Use Cases

  • Complex Problem Solving: Suited for applications requiring deep analytical reasoning, such as mathematical problem-solving or scientific inquiry.
  • Knowledge-Intensive Tasks: Effective in scenarios demanding high accuracy in answering questions based on extensive knowledge, as indicated by its GPQA Diamond performance.
  • Research and Development: Its open-source nature and focus on reasoning make it a valuable tool for researchers exploring advanced AI capabilities.

Popular Sampler Settings

The parameter combinations most used by Featherless users for this model draw on the standard sampling knobs: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, and min_p.
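Of these parameters, min_p is the least standard: it discards candidate tokens whose probability falls below `min_p` times the probability of the single most likely token. A self-contained sketch of that filtering step, illustrative rather than the exact kernel any inference server uses:

```python
def min_p_filter(probs: list, min_p: float) -> list:
    """Return the indices of tokens that survive min-p filtering:
    a token is kept only if its probability is at least min_p times
    the probability of the most likely token."""
    threshold = min_p * max(probs)
    return [i for i, p in enumerate(probs) if p >= threshold]
```

Unlike a fixed top_k cutoff, the surviving set adapts to how peaked the distribution is: a confident top token prunes aggressively, a flat distribution keeps many candidates.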