asparius/Qwen2.5-1.5B-SPO-1ep-iter2
asparius/Qwen2.5-1.5B-SPO-1ep-iter2 is a 1.5-billion-parameter language model fine-tuned from Qwen/Qwen2.5-1.5B. It was trained on the DigitalLearningGmbH/MATH-lighteval dataset using the GRPO method, a reinforcement learning approach designed to enhance mathematical reasoning. The model is optimized for tasks requiring strong mathematical problem solving and supports a 131072-token context window for long, multi-step problems.
Model Overview
This model, asparius/Qwen2.5-1.5B-SPO-1ep-iter2, is a specialized fine-tuned version of the Qwen2.5-1.5B base model. It has been trained with a focus on improving mathematical reasoning abilities, utilizing the DigitalLearningGmbH/MATH-lighteval dataset.
Key Characteristics
- Base Model: Qwen/Qwen2.5-1.5B, a 1.5 billion parameter language model.
- Training Method: Fine-tuned using GRPO (Group Relative Policy Optimization), a method introduced in the paper "DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models" (arXiv:2402.03300).
- Dataset: Specifically trained on the DigitalLearningGmbH/MATH-lighteval dataset, indicating an optimization for mathematical tasks.
- Context Length: Features a substantial context window of 131072 tokens, beneficial for handling complex problems with extensive input.
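The core idea behind GRPO, as described in the DeepSeekMath paper, is to sample a group of completions per prompt, score each with a reward (e.g. answer correctness), and normalize each reward against the group's mean and standard deviation instead of using a learned value model. A minimal sketch of that group-relative advantage computation (illustrative only; the actual training pipeline for this model is not documented here):

```python
# Sketch of GRPO-style group-relative advantages: for one prompt,
# several completions are sampled and scored, and each completion's
# advantage is its reward normalized within the group.
from statistics import mean, pstdev

def group_relative_advantages(rewards: list[float]) -> list[float]:
    """Normalize per-completion rewards within one prompt's sample group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    if sigma == 0:
        # All completions scored identically: no learning signal.
        return [0.0 for _ in rewards]
    return [(r - mu) / sigma for r in rewards]

# Example: 4 sampled solutions to one MATH problem, reward 1.0 = correct.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
# Correct solutions get positive advantage, incorrect ones negative.
```

Completions with above-average reward in their group are reinforced and below-average ones are penalized, which is what drives the improvement in mathematical reasoning without a separate critic network.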
Use Cases
This model is particularly well-suited for applications requiring:
- Mathematical Reasoning: Excels in tasks that involve solving mathematical problems and understanding complex numerical relationships.
- Educational Tools: Can be integrated into systems for generating math explanations, solving equations, or assisting with mathematical homework.
- Research in Mathematical AI: Provides a strong baseline for further research and development in enhancing AI's mathematical capabilities.
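For the use cases above, the model can be loaded with the standard transformers AutoModelForCausalLM / AutoTokenizer API. A minimal sketch (the prompt template below is an assumption for illustration; the exact format used during training is not documented in this card):

```python
def build_prompt(problem: str) -> str:
    # Hypothetical instruction-style prompt; the template the model was
    # actually trained with is not documented here.
    return f"Problem: {problem}\nSolution:"

def solve(problem: str, max_new_tokens: int = 256) -> str:
    # transformers is imported lazily so build_prompt stays usable
    # without the library installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "asparius/Qwen2.5-1.5B-SPO-1ep-iter2"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

    inputs = tokenizer(build_prompt(problem), return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(
        output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
    )
```

Generation settings such as max_new_tokens, temperature, or sampling strategy should be tuned per task; step-by-step math solutions often need a generous token budget.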