Model Overview
This model, leonMW/DeepSeek-R1-Distill-Qwen-1.5B-GSPO-Basic, is a 1.5 billion parameter language model derived from deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B. It features a substantial context length of 32768 tokens, making it suitable for processing longer inputs.
Key Training Details
The primary differentiator for this model is its training methodology. It was fine-tuned using GRPO (Group Relative Policy Optimization), a reinforcement learning method introduced in the research paper "DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models" (arXiv:2402.03300). GRPO scores groups of sampled completions against each other instead of relying on a learned value function, and this training aims to improve the model's performance on complex reasoning tasks, particularly those involving mathematics.
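The core idea behind GRPO can be illustrated with a short sketch: for each prompt, several completions are sampled and scored, and each completion's advantage is its reward normalized against the group's mean and standard deviation. The rewards below are made-up numbers for illustration, not values from this model's actual training run.

```python
# Minimal sketch of GRPO's group-relative advantage computation.
# Rewards here are hypothetical; real training scores completions
# with a reward function or verifier.
from statistics import mean, pstdev

def group_relative_advantages(rewards):
    """A_i = (r_i - mean(r)) / std(r), computed within one prompt's
    group of sampled completions. The group mean replaces the learned
    value baseline used by methods like PPO."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    if sigma == 0:
        # All completions scored identically: no learning signal.
        return [0.0 for _ in rewards]
    return [(r - mu) / sigma for r in rewards]

# Example: 4 completions sampled for one prompt.
advantages = group_relative_advantages([1.0, 0.0, 0.5, 0.5])
```

Completions scoring above the group mean get positive advantages (their tokens are reinforced); those below get negative advantages.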
Use Cases
Given its GRPO-based training, this model is particularly well-suited for:
- Mathematical reasoning tasks: Benefiting from the DeepSeekMath-derived training approach.
- Complex problem-solving: Where logical deduction and structured thinking are required.
- Applications requiring deep understanding of context: Due to its large 32768 token context window.
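For these use cases, the model can be loaded with the standard Transformers text-generation API. This is a hedged sketch, not an official usage snippet from the model authors; it downloads the weights from the Hugging Face Hub on first run, and the prompt is an arbitrary example.

```python
# Hypothetical usage sketch using the standard Transformers API.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "leonMW/DeepSeek-R1-Distill-Qwen-1.5B-GSPO-Basic"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

# Build a chat-formatted prompt (example question).
messages = [{"role": "user", "content": "What is 17 * 23? Show your reasoning."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

# Generate; a generous max_new_tokens leaves room for the model's
# chain-of-thought within its 32768-token context window.
outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```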
Frameworks Used
The model's training leveraged several key frameworks:
- TRL 0.23.1
- Transformers 4.57.1
- PyTorch 2.8.0
- Datasets 4.4.1
- Tokenizers 0.22.1