akhauriyash/DeepSeek-R1-Distill-Qwen-1.5B-SpeculativeReasoner

Public · 1.5B parameters · BF16 · Updated Apr 16, 2025 · Hugging Face

akhauriyash/DeepSeek-R1-Distill-Qwen-1.5B-SpeculativeReasoner is a 1.5-billion-parameter language model fine-tuned by akhauriyash from the deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B base model. It specializes in speculative reasoning, particularly for mathematical tasks, and supports a 131072-token context length, making it suitable for complex problem-solving over long inputs.

Model Overview

This model, akhauriyash/DeepSeek-R1-Distill-Qwen-1.5B-SpeculativeReasoner, is a 1.5 billion parameter language model developed by akhauriyash. It is a fine-tuned version of the deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B base model, specifically optimized for speculative reasoning.
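A minimal inference sketch using the Hugging Face transformers library. The model id is taken from this card; the chat-template call and the generation settings (sampling, temperature, token budget) are illustrative assumptions for a DeepSeek-R1-style distill, not values published for this checkpoint, and running it requires downloading the weights:

```python
# Sketch: loading and prompting the model with transformers.
# Assumes network access and enough memory for a 1.5B BF16 model;
# device_map="auto" additionally requires the accelerate package.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "akhauriyash/DeepSeek-R1-Distill-Qwen-1.5B-SpeculativeReasoner"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [
    {"role": "user", "content": "Prove that the sum of two even numbers is even."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Generation settings are guesses, not the card's recommendation.
outputs = model.generate(inputs, max_new_tokens=1024, do_sample=True, temperature=0.6)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```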

Key Capabilities

  • Speculative Reasoning: The model has been fine-tuned on the akhauriyash/OpenR1_Math_SpeculativeReasoning dataset, enhancing its ability to perform speculative reasoning, particularly in mathematical contexts.
  • Extended Context Window: It supports a substantial context length of 131072 tokens, allowing it to process and reason over large inputs.
  • Instruction Following: Trained using SFT (Supervised Fine-Tuning) with the TRL library, it is designed to follow instructions effectively.
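Speculative reasoning builds on the speculative-decoding idea: a cheap draft proposes several tokens ahead, and the expensive target model verifies them, keeping the longest agreeing prefix. The following stdlib-only toy, with integers standing in for tokens and both "models" invented purely for illustration, sketches that accept/verify loop:

```python
def draft_model(prefix, k):
    """Cheap drafter: guesses +1 steps but plateaus at 4 (a deliberate flaw)."""
    out, cur = [], prefix[-1]
    for _ in range(k):
        cur = cur + 1 if cur < 4 else cur
        out.append(cur)
    return out

def target_model(prefix):
    """'Expensive' target: the true sequence always increments by 1."""
    return prefix[-1] + 1

def speculative_step(prefix, k=4):
    """Verify k draft tokens against the target: accept the agreeing prefix,
    replace the first mismatch with the target's token, and append one bonus
    token when every draft is accepted."""
    accepted = []
    for tok in draft_model(prefix, k):
        correct = target_model(prefix + accepted)
        if tok == correct:
            accepted.append(tok)      # draft verified: token accepted for free
        else:
            accepted.append(correct)  # mismatch: target's token wins, stop
            break
    else:
        accepted.append(target_model(prefix + accepted))  # bonus token
    return accepted

print(speculative_step([1], k=3))              # all drafts verified + bonus
print(speculative_step([1, 2, 3, 4, 5], k=3))  # draft wrong immediately
```

When the draft agrees, one verification pass yields several tokens; when it is wrong, the step degrades gracefully to a single target-model token.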

Training Details

The model was trained with Supervised Fine-Tuning (SFT) using the TRL framework, adapting the base deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B model to tasks requiring speculative reasoning via the specialized akhauriyash/OpenR1_Math_SpeculativeReasoning mathematical reasoning dataset.
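As a training-configuration sketch only: an SFT run of this shape with TRL might look like the following. The dataset and base-model ids come from this card, but every hyperparameter and the output path are illustrative assumptions, not the author's actual recipe:

```python
# Hedged sketch of an SFT run with TRL (pip install trl datasets).
# Hyperparameters below are illustrative guesses, not the card's recipe.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("akhauriyash/OpenR1_Math_SpeculativeReasoning", split="train")

config = SFTConfig(
    output_dir="r1-distill-spec-reasoner",  # hypothetical output directory
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    learning_rate=1e-5,
    num_train_epochs=1,
)

trainer = SFTTrainer(
    model="deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",  # base model from this card
    args=config,
    train_dataset=dataset,
)
trainer.train()
```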