Qwen/Qwen2.5-Math-72B

Hugging Face
  • Task: Text generation
  • Model size: 72.7B parameters
  • Quantization: FP8
  • Context length: 32k
  • Published: Sep 16, 2024
  • License: qwen
  • Architecture: Transformer
  • Concurrency cost: 4

Qwen/Qwen2.5-Math-72B is a 72.7 billion parameter mathematical language model developed by Qwen, specifically designed for solving math problems in both English and Chinese. It supports Chain-of-Thought (CoT) and Tool-integrated Reasoning (TIR) for enhanced computational accuracy, symbolic manipulation, and algorithmic reasoning. This model is optimized for mathematical tasks and serves as a strong base for fine-tuning, offering significant performance improvements over its predecessor on mathematical benchmarks.


Qwen2.5-Math-72B: Specialized Mathematical LLM

Qwen2.5-Math-72B is a 72.7 billion parameter base model from the Qwen2.5-Math series, developed by Qwen. This series represents an upgrade from the Qwen2-Math models, focusing on advanced mathematical problem-solving capabilities.

Key Capabilities

  • Mathematical Reasoning: Primarily designed for solving math problems in both English and Chinese.
  • Reasoning Methods: Supports two core reasoning approaches:
    • Chain-of-Thought (CoT): Enhances general reasoning capabilities.
    • Tool-integrated Reasoning (TIR): Improves computational accuracy, symbolic manipulation, and algorithmic reasoning, crucial for complex mathematical tasks like finding roots of equations or computing eigenvalues.
  • Performance: Achieves strong results on mathematical benchmarks, with the instruction-tuned variant (Qwen2.5-Math-72B-Instruct) scoring 87.8 on the MATH benchmark using TIR.
  • Base Model: This specific model (Qwen2.5-Math-72B) is a base model, suitable for completion and few-shot inference, and serves as an excellent starting point for further fine-tuning.
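Because this is a base model rather than an instruction-tuned chat model, prompting it typically means assembling worked examples and letting the model complete the next solution. Below is a minimal sketch of few-shot prompt assembly; the example problems and the "Problem:/Solution:" layout are illustrative assumptions, not a format the model requires.

```python
# Illustrative few-shot examples; any worked math problems would do.
FEW_SHOT_EXAMPLES = [
    ("What is 7 * 8?", "7 * 8 = 56. The answer is 56."),
    ("What is 15 + 27?", "15 + 27 = 42. The answer is 42."),
]

def build_prompt(question: str) -> str:
    """Assemble worked examples followed by the new problem, leaving the
    solution open so the base model completes it."""
    parts = [f"Problem: {q}\nSolution: {a}\n" for q, a in FEW_SHOT_EXAMPLES]
    parts.append(f"Problem: {question}\nSolution:")
    return "\n".join(parts)

prompt = build_prompt("What is 12 * 12?")

# Generation itself via the standard Hugging Face transformers API
# (requires substantial GPU memory for 72.7B parameters):
# from transformers import AutoModelForCausalLM, AutoTokenizer
# tok = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Math-72B")
# model = AutoModelForCausalLM.from_pretrained(
#     "Qwen/Qwen2.5-Math-72B", torch_dtype="auto", device_map="auto")
# ids = tok(prompt, return_tensors="pt").to(model.device)
# out = model.generate(**ids, max_new_tokens=512)
```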

Good For

  • Mathematical Problem Solving: Ideal for applications requiring robust solutions to math problems.
  • Fine-tuning: Provides a powerful foundation for developers looking to fine-tune a model for specific mathematical domains or tasks.
  • Research in Mathematical LLMs: Useful for exploring and advancing techniques in mathematical reasoning with large language models.
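The Tool-integrated Reasoning (TIR) mode described above has the model emit executable code whose result is fed back into the reasoning. A minimal host-side sketch of that loop follows; the convention that the emitted block assigns its answer to `result` is a hypothetical simplification, and the stub string stands in for a real model generation. A real deployment must sandbox the execution step.

```python
import re

FENCE = "`" * 3  # triple-backtick marker delimiting the model's code block

def extract_and_run(model_output: str):
    """Pull the first python code block out of the model's output and run it.
    Hypothetical convention: the block assigns its answer to `result`.
    WARNING: exec() on model output must be sandboxed in practice."""
    pattern = re.compile(FENCE + r"python\n(.*?)" + FENCE, re.DOTALL)
    match = pattern.search(model_output)
    if match is None:
        return None  # no tool call; the text itself is the final answer
    namespace: dict = {}
    exec(match.group(1), namespace)
    return namespace.get("result")

# Stub standing in for an actual model generation:
stub_output = (
    "To compute 2**10 I will use Python.\n"
    + FENCE + "python\nresult = 2 ** 10\n" + FENCE
)
print(extract_and_run(stub_output))  # -> 1024
```

The extracted result would be appended to the prompt so the model can continue reasoning with an exact computed value instead of arithmetic done in text.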

Popular Sampler Settings

Featherless users most commonly tune the following sampler parameters for this model: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, and min_p.
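As an illustration of where these sampler parameters go, the sketch below shapes them into an OpenAI-compatible completions request body. The numeric values are hypothetical placeholders, not the actual user configurations or recommended settings.

```python
# Hypothetical sampler values for illustration only.
payload = {
    "model": "Qwen/Qwen2.5-Math-72B",
    "prompt": "Problem: What is 12 * 12?\nSolution:",
    "max_tokens": 256,
    "temperature": 0.7,          # sampling randomness
    "top_p": 0.9,                # nucleus sampling cutoff
    "top_k": 50,                 # keep only the k most likely tokens
    "frequency_penalty": 0.0,    # penalize tokens by repeat count
    "presence_penalty": 0.0,     # penalize tokens already present
    "repetition_penalty": 1.05,  # multiplicative repetition penalty
    "min_p": 0.05,               # minimum relative probability floor
}
# This dict would be POSTed as JSON to an OpenAI-compatible
# /v1/completions endpoint, e.g. with the `requests` library.
```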