unsloth/QwQ-32B

  • Parameters: 32B
  • Quantization: FP8
  • Context length: 32,768 tokens (native; see the YaRN note below)
  • License: apache-2.0
  • Source: Hugging Face
Overview

QwQ-32B is a 32.5-billion-parameter causal language model developed by Qwen and focused on advanced reasoning. It is built on a transformer architecture incorporating RoPE, SwiGLU, RMSNorm, and attention QKV bias. The model goes through both pretraining and post-training, including supervised fine-tuning and reinforcement learning, which substantially improve its performance on challenging downstream tasks.
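
A minimal inference sketch using the Hugging Face transformers library; the chat-template flow mirrors the upstream Qwen/QwQ-32B card, and the prompt here is an illustrative placeholder:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "unsloth/QwQ-32B"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",  # keep the checkpoint's stored precision
    device_map="auto",   # shard across available GPUs
)

messages = [{"role": "user", "content": "How many prime numbers are there below 50?"}]
# The chat template inserts the scaffolding QwQ expects for its reasoning output.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

inputs = tokenizer([text], return_tensors="pt").to(model.device)
# Greedy decoding is discouraged for reasoning models like QwQ; see the recommended
# sampling parameters under Key Capabilities below.
output_ids = model.generate(**inputs, max_new_tokens=2048)
print(tokenizer.decode(output_ids[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```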

Key Capabilities

  • Enhanced Reasoning: Designed specifically to excel at reasoning tasks, with performance competitive with models such as DeepSeek-R1 and o1-mini.
  • Extended Context Window: Supports a full context length of 131,072 tokens; enabling YaRN is required for prompts longer than 8,192 tokens (see the config sketch after this list).
  • Optimized for Problem Solving: Produces deliberate, step-by-step output. Recommended sampling parameters are Temperature=0.6, TopP=0.95, and TopK between 20 and 40, along with standardized output formats for math and multiple-choice questions (see the generation sketch below).
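
For long inputs, the upstream Qwen/QwQ-32B card instructs enabling YaRN rope scaling. A minimal sketch of doing so through transformers, assuming the values documented there (a 4.0 scaling factor over a 32,768-token native window); verify them against the repo's config.json before relying on this:

```python
from transformers import AutoConfig, AutoModelForCausalLM

# YaRN rope scaling: 4.0 x 32,768 native positions = 131,072-token window.
config = AutoConfig.from_pretrained("unsloth/QwQ-32B")
config.rope_scaling = {
    "type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 32768,
}
model = AutoModelForCausalLM.from_pretrained(
    "unsloth/QwQ-32B",
    config=config,
    torch_dtype="auto",
    device_map="auto",
)
```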

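And a sketch of generation with the recommended sampling parameters, reusing the model, tokenizer, and inputs from the loading example above; top_k=30 is an arbitrary midpoint of the suggested 20-40 range:

```python
# Reuses `model`, `tokenizer`, and `inputs` from the loading example above.
output_ids = model.generate(
    **inputs,
    do_sample=True,
    temperature=0.6,      # recommended setting
    top_p=0.95,           # recommended setting
    top_k=30,             # recommendation is a value between 20 and 40
    max_new_tokens=4096,  # reasoning traces can be long; arbitrary budget
)
print(tokenizer.decode(output_ids[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```
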
Good For

  • Complex Reasoning Tasks: Ideal for applications requiring deep logical inference and problem-solving.
  • Long Context Processing: Suited to processing long documents or extended conversations, leveraging the 131,072-token context window (with YaRN enabled).
  • Benchmarking and Research: Provides a robust foundation for evaluating and developing advanced reasoning AI systems.