Model Overview
Thrillcrazyer/Qwen-2.5-1.5B_TAC_Teacher_Qwen14B is a 1.5 billion parameter language model derived from Qwen/Qwen2.5-1.5B-Instruct. Its distinguishing feature is fine-tuning on the DeepMath-103k dataset, a collection curated for mathematical reasoning tasks.
Training Methodology
The model was trained using the TRL library with the GRPO (Group Relative Policy Optimization) method. GRPO is a reinforcement-learning technique introduced in the paper "DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models" (arXiv:2402.03300), designed to improve a model's mathematical reasoning abilities. Instead of training a separate value model, GRPO samples a group of completions per prompt and scores each one relative to the others in its group. This targeted training approach aims to enhance the model's performance on complex mathematical problems and logical deductions.
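The core of GRPO's group-relative scoring can be sketched in a few lines. This is an illustrative simplification, not the actual training code (which lives in TRL's `GRPOTrainer`): it shows only how per-completion advantages are computed by normalizing rewards within a sampled group, replacing the learned value baseline used by methods like PPO.

```python
# Illustrative sketch of GRPO's group-relative advantage computation.
# Assumption: rewards are scalar scores for a group of completions
# sampled from the same prompt (e.g. 1.0 for a correct math answer).
from statistics import mean, stdev

def group_relative_advantages(rewards):
    """Normalize each reward against its own group's statistics.

    The advantage of completion i is (r_i - group mean) / group std,
    so completions are ranked relative to sibling samples rather than
    against a separately trained value model.
    """
    mu = mean(rewards)
    sigma = stdev(rewards)  # in practice a small epsilon guards against sigma == 0
    return [(r - mu) / sigma for r in rewards]

# Four sampled answers to one problem: two correct, two wrong.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Correct completions receive positive advantages and incorrect ones negative, and the advantages of a group always sum to zero, which is what makes the baseline "group-relative".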
Key Capabilities
- Enhanced Mathematical Reasoning: Specialized training on DeepMath-103k with GRPO focuses on improving the model's ability to understand and solve mathematical problems.
- Instruction Following: Inherits instruction-following capabilities from its base model, Qwen2.5-1.5B-Instruct.
- Context Handling: Supports a substantial context length of 32768 tokens, allowing for processing longer and more complex problem descriptions.
Use Cases
This model is particularly well-suited for applications requiring:
- Solving mathematical equations and word problems.
- Assisting in educational tools for math and logic.
- Generating explanations for mathematical concepts.
- Tasks that benefit from strong logical deduction and numerical understanding.
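For the use cases above, the model can be run with the Hugging Face transformers library like any Qwen2.5-Instruct derivative. The sketch below is a hedged example, not an official snippet from the model authors: the system prompt and the `build_messages`/`generate_solution` helpers are illustrative choices, while the model id and chat-template usage follow standard transformers conventions.

```python
# Hedged usage sketch. Only the model id comes from the model card;
# the helper names and the system prompt are illustrative assumptions.
model_id = "Thrillcrazyer/Qwen-2.5-1.5B_TAC_Teacher_Qwen14B"

def build_messages(question: str) -> list[dict]:
    """Wrap a math question in the chat format the Qwen2.5-Instruct base expects."""
    return [
        {"role": "system", "content": "You are a helpful math assistant. Reason step by step."},
        {"role": "user", "content": question},
    ]

def generate_solution(question: str, max_new_tokens: int = 512) -> str:
    """Run one round of inference (requires transformers and a model download)."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    prompt = tokenizer.apply_chat_template(
        build_messages(question), tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, dropping the prompt.
    return tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
```

Example call: `generate_solution("If 3x + 5 = 20, what is x?")`. Because the base model supports a 32768-token context, long multi-step problem statements can be passed in the user message directly.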