Overview
modaserMoj/csc415-phase1-0.5b-fast is a 0.5-billion-parameter language model fine-tuned from the Qwen/Qwen2.5-0.5B base model. It supports a context length of 32768 tokens, allowing it to process extensive inputs and maintain coherence over long conversations or documents. The model was trained with the TRL framework.
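The card does not include a usage snippet, so below is a minimal inference sketch assuming the standard Transformers causal-LM loading path and the chat template that Qwen2.5-family checkpoints ship with; the prompt is only an illustration.

```python
# Minimal inference sketch; assumes the standard Transformers causal-LM API.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "modaserMoj/csc415-phase1-0.5b-fast"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Qwen2.5-based checkpoints ship a chat template, so a chat-style prompt is used.
messages = [{"role": "user", "content": "What is 17 * 24?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```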
Key Capabilities
- Enhanced Mathematical Reasoning: The model was trained with GRPO (Group Relative Policy Optimization), the reinforcement-learning method introduced in the DeepSeekMath paper, with the aim of improving performance on mathematical reasoning tasks.
- Long Context Understanding: The 32768-token context window lets the model handle complex queries and generate detailed responses that draw on large amounts of input (see the sketch after this list for a quick way to confirm the configured window).
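A quick way to verify the advertised context length is to read it from the model config; this assumes the checkpoint exposes max_position_embeddings, as Qwen2.5-family configs do.

```python
# Check the configured context window without downloading the full weights.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("modaserMoj/csc415-phase1-0.5b-fast")
print(config.max_position_embeddings)  # expected: 32768 per this card
```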
Training Details
The model's fine-tuning process used GRPO, a technique designed to push the limits of mathematical reasoning in open language models. The training environment included TRL 0.29.0, Transformers 5.3.0, PyTorch 2.10.0+cu128, Datasets 4.6.1, and Tokenizers 0.22.2.
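The exact dataset and reward function used for this checkpoint are not documented here. The sketch below shows the general shape of a GRPO run with TRL's GRPOTrainer, using a placeholder prompt dataset (trl-lib/tldr) and a hypothetical length-based reward; it is not the author's actual recipe.

```python
# Minimal GRPO training sketch with TRL's GRPOTrainer. The dataset and reward
# below are placeholders, not the ones used to produce this checkpoint.
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

# Hypothetical reward: prefer completions close to 200 characters long.
def reward_len(completions, **kwargs):
    return [-abs(200 - len(completion)) for completion in completions]

dataset = load_dataset("trl-lib/tldr", split="train")  # placeholder prompts

training_args = GRPOConfig(output_dir="qwen2.5-0.5b-grpo")
trainer = GRPOTrainer(
    model="Qwen/Qwen2.5-0.5B",
    reward_funcs=reward_len,
    args=training_args,
    train_dataset=dataset,
)
trainer.train()
```

GRPO samples a group of completions per prompt and scores each with the reward function, optimizing the policy against the group-relative advantages; any callable that maps completions to per-sample scores can stand in for the reward here.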
Good For
- Applications requiring strong mathematical problem-solving.
- Tasks benefiting from processing and generating long sequences of text.
- Research and development in mathematical AI.