Name: mobiuslabsgmbh/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: mobiuslabsgmbh

Model Overview

mobiuslabsgmbh/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1 is a 1.5 billion parameter language model that has been re-distilled from the original DeepSeek-R1-Distill-Qwen-1.5B. This re-distillation process aims to enhance the model's overall performance and accuracy across a range of benchmarks.

Key Performance Improvements

This version shows notable improvements compared to its predecessor, DeepSeek-R1-Distill-Qwen-1.5B, across several critical benchmarks:

ARC (25-shot): Improved from 40.96 to 41.55
HellaSwag (10-shot): Increased from 44 to 45.88
MMLU (5-shot): Rose from 39.27 to 41.82
TruthfulQA-MC2: Enhanced from 45.17 to 46.63
Winogrande (5-shot): Grew from 55.49 to 57.7
GSM8K (5-shot): Significantly improved from 69.9 to 74.3
Average: Overall average score increased from 49.13 to 51.31

Further improvements are also observed in more challenging benchmarks:

MMLU PRO (5-shot): Improved from 16.74 to 19.86
BBH (3-shot): Increased from 35.12 to 37.23

Use Cases

This model is particularly well-suited for applications requiring:

Enhanced reasoning capabilities: Demonstrated by improvements in ARC, MMLU, and BBH.
Accurate question answering: Benefiting from better performance in TruthfulQA and MMLU.
Mathematical problem-solving: Evidenced by the significant gain in GSM8K scores.

Developers can integrate this model using the provided Hugging Face Transformers library, leveraging its improved performance for various natural language processing tasks.