Gen-Verse/ReasonFlux-F1
Gen-Verse/ReasonFlux-F1 is a 32.8 billion parameter large language model developed by Gen-Verse, specifically fine-tuned for advanced reasoning tasks. It leverages a template-augmented reasoning paradigm to achieve state-of-the-art performance on complex mathematical and general reasoning benchmarks. With a context length of 131072 tokens, ReasonFlux-F1 excels in areas like competitive mathematics (AIME) and challenging question answering (GPQA-Diamond). This model is optimized for scenarios requiring deep logical inference and problem-solving capabilities.
Loading preview...
ReasonFlux-F1: Advanced Reasoning LLM
ReasonFlux-F1-32B is a 32.8 billion parameter language model developed by Gen-Verse, specifically engineered for superior reasoning performance. It utilizes a novel template-augmented reasoning paradigm, building upon the methodologies introduced in its predecessor, ReasonFlux-Zero.
Key Capabilities & Performance
This model demonstrates state-of-the-art results across a range of challenging reasoning benchmarks, outperforming other 32B-class models like R1-Distill-32B, o1-mini, and LIMO-32B. Key performance highlights include:
- MATH500: Achieves 96.0% pass@1.
- AIME 2024: Scores 76.7% pass@1.
- AIME 2025: Scores 53.3% pass@1.
- GPQA-Diamond: Achieves 67.2% pass@1.
These results highlight its strong capabilities in complex mathematical problem-solving and general knowledge-based reasoning. The model's development is detailed in the paper "ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates" (arXiv:2502.06772).
Ideal Use Cases
ReasonFlux-F1-32B is particularly well-suited for applications requiring:
- Advanced Mathematical Reasoning: Solving intricate math problems, including those found in competitive programming contexts.
- Complex Question Answering: Tackling challenging, multi-step reasoning questions.
- Logical Inference: Scenarios where deep understanding and logical deduction are paramount.
Developers can integrate ReasonFlux-F1 using VLLM for efficient inference, as demonstrated in the provided quick-start example.