namezz/lvm-math-0402-a-qwen2.5-7b-instruct-b-qwen2.5-1.5b-instruct
Text Generation · Concurrency Cost: 1 · Model Size: 1.5B · Quant: BF16 · Ctx Length: 32k · Published: Apr 3, 2026 · License: other · Architecture: Transformer

The namezz/lvm-math-0402-a-qwen2.5-7b-instruct-b-qwen2.5-1.5b-instruct model is a 1.5-billion-parameter instruction-tuned language model based on Qwen2.5-1.5B-Instruct. It has been fine-tuned on a mathematical dataset (7b_math_95k_16_train) and is optimized for mathematical reasoning tasks. On its evaluation set it reports a low final loss (0.0051) and token mean relative error (0.2861), making it suitable for applications requiring numerical accuracy.


Model Overview

This model, namezz/lvm-math-0402-a-qwen2.5-7b-instruct-b-qwen2.5-1.5b-instruct, is a specialized version of the Qwen2.5-1.5B-Instruct architecture, featuring 1.5 billion parameters and a 32768-token context length. It has undergone fine-tuning specifically on the 7b_math_95k_16_train dataset, indicating a strong focus on mathematical reasoning and problem-solving capabilities.
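To illustrate how the model can be loaded and queried, here is a minimal inference sketch using the Hugging Face transformers library. The chat-template call follows standard Qwen2.5 conventions; the example prompt, max_new_tokens value, and device_map setting are illustrative assumptions rather than settings published with this model.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "namezz/lvm-math-0402-a-qwen2.5-7b-instruct-b-qwen2.5-1.5b-instruct"

# Load the tokenizer and the weights in BF16 (matching the published precision).
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Build a chat-formatted prompt; the math question is only an illustrative example.
messages = [
    {"role": "user", "content": "Solve for x: 3x + 7 = 22. Show your steps."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Generate a response; max_new_tokens is an assumed value, not from the model card.
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```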

Key Capabilities

  • Mathematical Proficiency: Fine-tuned on a dedicated math dataset, suggesting enhanced performance in numerical and mathematical tasks.
  • Optimized for Accuracy: Achieves a final loss of 0.0051 and a Token Mean Relative Error of 0.2861 on the evaluation set, indicating a focus on precision.
  • Qwen2.5 Base: Leverages the robust architecture of Qwen2.5-1.5B-Instruct, providing a solid foundation for instruction-following.

Training Details

The model was trained with a learning rate of 2e-05, a total batch size of 1024 (across 4 GPUs with gradient accumulation), and for 2 epochs. The training utilized the fused AdamW optimizer (adamw_torch_fused) and a cosine learning rate scheduler with 50 warmup steps.
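These settings map directly onto a standard Hugging Face TrainingArguments configuration, sketched below. Note that only the effective total batch size of 1024 across 4 GPUs is reported, so the split into a per-device batch size of 16 with 16 gradient-accumulation steps is an assumption, as is the output directory name.

```python
from transformers import TrainingArguments

# Sketch of the reported hyperparameters; the per-device batch size / accumulation
# split is assumed — only the effective total batch size of 1024 is stated.
training_args = TrainingArguments(
    output_dir="lvm-math-0402",        # placeholder output path
    learning_rate=2e-5,
    per_device_train_batch_size=16,    # assumed: 16 x 16 accum x 4 GPUs = 1024
    gradient_accumulation_steps=16,    # assumed split (see above)
    num_train_epochs=2,
    optim="adamw_torch_fused",         # fused AdamW, as reported
    lr_scheduler_type="cosine",
    warmup_steps=50,
    bf16=True,                         # matches the BF16 precision of the release
)
```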

Good for

  • Applications requiring mathematical problem-solving.
  • Tasks where numerical accuracy is critical.
  • Use cases benefiting from a compact yet specialized instruction-tuned model.