rghosh8/gsm8k-deepseek-r1-distill-qwen-1.5b-rajat-seed-3407-G-16_merged

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:1.5BQuant:BF16Ctx Length:32kPublished:Apr 1, 2026Architecture:Transformer Warm

The rghosh8/gsm8k-deepseek-r1-distill-qwen-1.5b-rajat-seed-3407-G-16_merged model is a 1.5 billion parameter language model, fine-tuned from DeepSeek-R1-Distill-Qwen-1.5B. It was specifically optimized on the GSM8K dataset using GRPO, indicating a focus on mathematical reasoning and problem-solving. With a context length of 32768 tokens, this model is designed for tasks requiring robust numerical and logical capabilities.

Loading preview...

Model Overview

This model, gsm8k-deepseek-r1-distill-qwen-1.5b-rajat-seed-3407-G-16_merged, is a 1.5 billion parameter language model. It is a merged model, fine-tuned from the base deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B architecture.

Key Capabilities

  • Mathematical Reasoning: The model has been specifically fine-tuned on the GSM8K dataset, which is a benchmark for grade school math word problems. This training indicates a strong focus on numerical and logical problem-solving.
  • Optimization Method: The fine-tuning process utilized GRPO (Generalized Reinforcement Learning from Human Feedback with Policy Optimization), suggesting an advanced approach to enhance performance on specific tasks.
  • Context Length: It supports a substantial context length of 32768 tokens, allowing it to process and generate longer sequences of text.

Good For

  • Mathematical Problem Solving: Ideal for applications requiring the model to understand and solve arithmetic and word problems.
  • Reasoning Tasks: Suitable for scenarios where logical deduction and step-by-step thinking are crucial.
  • Integration: Easily loadable using the Hugging Face transformers library for quick deployment in Python projects.