Name: HenryShan/Qwen2.5-Math-7B-DPO-10K API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: HenryShan

Overview

HenryShan/Qwen2.5-Math-7B-DPO-10K is a 7.6 billion parameter model, fine-tuned from the Qwen2.5-Math-7B base model. Its primary focus is on mathematical reasoning, specifically designed to provide step-by-step solutions to complex math problems.

Key Capabilities

Specialized Mathematical Reasoning: Excels in generating detailed solutions for problems in algebra, calculus, and geometry.
Direct Preference Optimization (DPO): Fine-tuned using DPO on the Math-Step-DPO-10K dataset to enhance the quality and clarity of its mathematical explanations.
Parameter-Efficient Fine-tuning: Utilizes LoRA (Low-Rank Adaptation) with specific configurations (Rank: 8, Alpha: 10, Dropout: 0) for efficient adaptation.
Apple Silicon Compatibility: The fine-tuning process was conducted using mlx_lm.lora on Apple Silicon Mac hardware, indicating potential optimization for this ecosystem.

Good For

Applications requiring detailed, step-by-step mathematical problem-solving.
Educational tools or platforms that need to explain mathematical concepts and solutions.
Research into advanced mathematical reasoning capabilities of large language models.

Overview

Overview

Key Capabilities

Good For

Full Model Card (README)