rghosh8/gsm8k-deepseek-r1-distill-qwen-1.5b-rajat-seed-3407-G-4_merged

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:1.5BQuant:BF16Ctx Length:32kPublished:Apr 1, 2026Architecture:Transformer Warm

The rghosh8/gsm8k-deepseek-r1-distill-qwen-1.5b-rajat-seed-3407-G-4_merged model is a 1.5 billion parameter causal language model fine-tuned from DeepSeek-R1-Distill-Qwen-1.5B. It was specifically optimized for mathematical reasoning tasks, particularly on the GSM8K dataset, using the GRPO method. With a context length of 32768 tokens, this model is designed for efficient performance in arithmetic and problem-solving applications.

Loading preview...

Model Overview

This model, gsm8k-deepseek-r1-distill-qwen-1.5b-rajat-seed-3407-G-4_merged, is a 1.5 billion parameter language model derived from the DeepSeek-R1-Distill-Qwen-1.5B architecture. It has been further fine-tuned with a focus on enhancing its mathematical reasoning capabilities.

Key Characteristics

  • Base Model: Built upon deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B.
  • Fine-tuning: Optimized using the GRPO method on the GSM8K dataset, which is a benchmark for grade school math word problems.
  • Parameter Count: Features 1.5 billion parameters, offering a balance between performance and computational efficiency.
  • Context Length: Supports a substantial context window of 32768 tokens.

Use Cases

This model is particularly well-suited for applications requiring:

  • Mathematical Problem Solving: Excels in tasks related to arithmetic and logical reasoning, as demonstrated by its fine-tuning on GSM8K.
  • Efficient Inference: Its 1.5B parameter size makes it suitable for scenarios where computational resources are a consideration, while still providing strong performance in its specialized domain.
  • Educational Tools: Can be integrated into systems designed to assist with or generate mathematical questions and solutions.