Shiyu-Lab/DeepSeek-R1-Distill-Qwen-1.5B-thinkprune-iter2k
Task: Text Generation | Concurrency Cost: 1 | Model Size: 1.5B | Quant: BF16 | Ctx Length: 32k | Published: Apr 8, 2025 | Architecture: Transformer
Shiyu-Lab/DeepSeek-R1-Distill-Qwen-1.5B-thinkprune-iter2k is a 1.5-billion-parameter language model from Shiyu-Lab. It is a distilled variant built on the Qwen architecture, likely optimized for efficiency and targeted task performance, and the "thinkprune" suffix indicates additional pruning applied on top of the distillation. With a context length of 131,072 tokens, it can handle long input sequences, and the combination of distillation and pruning suggests a focus on maintaining strong performance within a small footprint, making it suitable for applications that need efficient inference together with deep contextual understanding.
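As a rough sketch of how such a checkpoint is typically used for text generation, the snippet below loads it in BF16 with the Hugging Face transformers API. This assumes the repository is published on the Hugging Face Hub under the same id and follows the standard causal-LM layout; the prompt and sampling settings are illustrative, not taken from the listing.

```python
# Minimal text-generation sketch (assumes a standard Hugging Face causal LM;
# repo id taken from the listing above).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Shiyu-Lab/DeepSeek-R1-Distill-Qwen-1.5B-thinkprune-iter2k"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the BF16 precision in the listing
    device_map="auto",
)

prompt = "Explain why the sky is blue in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate a bounded number of new tokens; sampling settings are examples only.
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```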