INTELLECT-2: A Reasoning Model for Math and Code
INTELLECT-2 is a 32 billion parameter language model from PrimeIntellect, distinguished by its training methodology: a distributed asynchronous reinforcement learning (RL) run using globally contributed GPU resources. Built on the Qwen2 architecture, the model is optimized for complex mathematical and coding tasks.
Key Capabilities & Features
- Reinforcement Learning Training: Trained with prime-rl, a framework for distributed asynchronous RL that applies GRPO over verifiable rewards, with modifications to improve training stability.
- Specialized Performance: Improves on its base model, QwQ-32B, on mathematical benchmarks (AIME24, AIME25) and coding tasks (LiveCodeBench v5).
- Context Length: Supports a 131,072-token context window.
- Length Control: Performs best when prompted with "Think for 10000 tokens before giving a response."; other thinking budgets (2000, 4000, 6000, 8000) also yield strong results.
- Compatibility: Compatible with popular inference engines like vllm and sglang.
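To make the GRPO objective mentioned above concrete, the sketch below computes group-relative advantages from a group of verifiable (pass/fail) rewards: each completion's reward is centred on the group mean and scaled by the group standard deviation. This is an illustrative toy, not the prime-rl implementation; the function name and reward values are invented for the example.

```python
# Illustrative sketch of GRPO's group-relative advantage computation
# over verifiable rewards (NOT the actual prime-rl code).
def group_relative_advantages(rewards, eps=1e-8):
    """Centre each reward on the group mean and scale by the group std."""
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = var ** 0.5
    return [(r - mean) / (std + eps) for r in rewards]

# A verifiable reward: 1.0 if the sampled completion's answer checks out,
# 0.0 otherwise. Four hypothetical completions for one prompt:
rewards = [1.0, 0.0, 1.0, 0.0]
advantages = group_relative_advantages(rewards)
```

Completions that beat their group's average receive positive advantage and are reinforced; below-average ones are penalized, without needing a learned value model.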
Use Cases & Considerations
INTELLECT-2 is particularly well-suited for applications requiring strong reasoning in mathematics and code generation or analysis. While it excels in these areas, its general instruction-following performance (IFEval) decreased slightly relative to the base model, reflecting its focused specialization. Developers should use the length control prompting described above for the best output quality in its target domains. For more technical details, refer to the technical report.
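As a minimal sketch of the recommended length-control prompting, the snippet below builds a chat request payload of the kind an OpenAI-compatible vLLM or SGLang server would accept. The helper name, the placement of the instruction in the system message, the model identifier, and the example question are assumptions for illustration.

```python
# Sketch: composing a length-controlled chat request for an
# OpenAI-compatible inference server (vLLM / SGLang).
import json

def length_controlled_messages(question, budget=10000):
    # The model is tuned to respond to this exact phrasing;
    # placing it in the system message is an assumption.
    system = f"Think for {budget} tokens before giving a response."
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]

payload = {
    "model": "PrimeIntellect/INTELLECT-2",  # assumed model identifier
    "messages": length_controlled_messages("Prove that 2^n > n for n >= 1."),
    "max_tokens": 16000,
}
body = json.dumps(payload)  # POST to the server's /v1/chat/completions
```

Swapping the `budget` argument for 2000, 4000, 6000, or 8000 selects one of the other supported thinking budgets.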