akhauriyash/DeepSeek-R1-Distill-Qwen-1.5B-SpeculativeReasoner is a 1.5-billion-parameter language model fine-tuned by akhauriyash. It is based on the DeepSeek-R1-Distill-Qwen-1.5B architecture and specializes in speculative reasoning, particularly for mathematical tasks. The model supports a 131,072-token context window, making it suitable for complex problem-solving that requires extensive context.
Model Overview
This model, akhauriyash/DeepSeek-R1-Distill-Qwen-1.5B-SpeculativeReasoner, is a 1.5 billion parameter language model developed by akhauriyash. It is a fine-tuned version of the deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B base model, specifically optimized for speculative reasoning.
Key Capabilities
- Speculative Reasoning: The model has been fine-tuned on the akhauriyash/OpenR1_Math_SpeculativeReasoning dataset, enhancing its ability to perform speculative reasoning, particularly in mathematical contexts.
- Extended Context Window: It supports a substantial context length of 131,072 tokens, allowing it to process and reason over large inputs.
- Instruction Following: Trained using SFT (Supervised Fine-Tuning) with the TRL library, it is designed to follow instructions effectively.
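Since the model is published on the Hugging Face Hub, it can be loaded with the standard transformers API. The snippet below is a minimal inference sketch; the example prompt and generation settings are illustrative assumptions, not values taken from the model card.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "akhauriyash/DeepSeek-R1-Distill-Qwen-1.5B-SpeculativeReasoner"

# Download tokenizer and weights from the Hub (several GB on first run).
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

# Illustrative math prompt; the chat template is inherited from the base model.
messages = [{"role": "user",
             "content": "What is the sum of the first 10 positive integers?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

# max_new_tokens here is an assumption; reasoning traces can be long,
# so a generous budget is typical for this kind of model.
outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Because the model advertises a 131,072-token context window, long multi-step problems can be passed in a single prompt without truncation.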
Training Details
The model was trained with Supervised Fine-Tuning (SFT) using the TRL framework, adapting the base DeepSeek-R1-Distill-Qwen-1.5B model to excel at speculative reasoning through training on the specialized akhauriyash/OpenR1_Math_SpeculativeReasoning mathematical reasoning dataset.
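The card does not publish the exact training script. A minimal sketch of what SFT with TRL on this dataset could look like is shown below; the base model and dataset names come from the card, while every hyperparameter (and the reliance on TRL defaults) is an assumption.

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Dataset named in the model card.
dataset = load_dataset("akhauriyash/OpenR1_Math_SpeculativeReasoning",
                       split="train")

# Illustrative configuration; the actual run's hyperparameters are unknown.
training_args = SFTConfig(
    output_dir="DeepSeek-R1-Distill-Qwen-1.5B-SpeculativeReasoner",
)

# TRL accepts a model id string and loads the base checkpoint itself.
trainer = SFTTrainer(
    model="deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
    train_dataset=dataset,
    args=training_args,
)
trainer.train()
```

In practice a 131,072-token sequence length at 1.5B parameters would also require memory-saving measures (gradient checkpointing, packing, or sequence-length truncation), none of which are documented in the card.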