rghosh8/gsm8k-deepseek-r1-distill-qwen-1.5b-rajat-seed-3407-G-16_merged
TEXT GENERATIONConcurrency Cost:1Model Size:1.5BQuant:BF16Ctx Length:32kPublished:Apr 1, 2026Architecture:Transformer Warm
The rghosh8/gsm8k-deepseek-r1-distill-qwen-1.5b-rajat-seed-3407-G-16_merged model is a 1.5 billion parameter language model, fine-tuned from DeepSeek-R1-Distill-Qwen-1.5B. It was specifically optimized on the GSM8K dataset using GRPO, indicating a focus on mathematical reasoning and problem-solving. With a context length of 32768 tokens, this model is designed for tasks requiring robust numerical and logical capabilities.
Loading preview...
Model Overview
This model, gsm8k-deepseek-r1-distill-qwen-1.5b-rajat-seed-3407-G-16_merged, is a 1.5 billion parameter language model. It is a merged model, fine-tuned from the base deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B architecture.
Key Capabilities
- Mathematical Reasoning: The model has been specifically fine-tuned on the GSM8K dataset, which is a benchmark for grade school math word problems. This training indicates a strong focus on numerical and logical problem-solving.
- Optimization Method: The fine-tuning process utilized GRPO (Generalized Reinforcement Learning from Human Feedback with Policy Optimization), suggesting an advanced approach to enhance performance on specific tasks.
- Context Length: It supports a substantial context length of 32768 tokens, allowing it to process and generate longer sequences of text.
Good For
- Mathematical Problem Solving: Ideal for applications requiring the model to understand and solve arithmetic and word problems.
- Reasoning Tasks: Suitable for scenarios where logical deduction and step-by-step thinking are crucial.
- Integration: Easily loadable using the Hugging Face
transformerslibrary for quick deployment in Python projects.