Name: mobiuslabsgmbh/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.0 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: mobiuslabsgmbh

Model Overview

The mobiuslabsgmbh/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.0 is a 1.5 billion parameter language model that has undergone a re-distillation process from its predecessor, the DeepSeek-R1-Distill-Qwen-1.5B model. This re-distillation aims to enhance overall performance and efficiency.

Key Performance Improvements

This re-distilled version shows notable gains across several benchmarks compared to its base model:

ARC (25-shot): Improved from 40.96 to 41.3
HellaSwag (10-shot): Increased from 44 to 45.22
MMLU (5-shot): Saw a significant jump from 39.27 to 42.01
TruthfulQA-MC2: Rose from 45.17 to 46.64
Winogrande (5-shot): Improved from 55.49 to 56.75
GSM8K (5-shot): Demonstrated strong improvement from 69.9 to 73.24
Average Score: Overall average increased from 49.13 to 50.86

Additional improvements were observed in GPQA (0-shot), MMLU PRO (5-shot), and IfEval (0-shot), indicating a more robust and capable model for various reasoning and knowledge-based tasks. The model maintains a substantial context length of 131072 tokens.

Ideal Use Cases

Given its enhanced benchmark performance, this model is well-suited for applications requiring:

General Language Understanding: Tasks involving comprehension and generation of text.
Mathematical Reasoning: Demonstrated by its strong GSM8K score, making it useful for arithmetic and logical problem-solving.
Question Answering: Improved scores on MMLU and TruthfulQA suggest better factual recall and response accuracy.
Long Context Processing: Its 131072 token context window allows for handling extensive documents or conversations.