daman1209arora/alpha_0.2_DeepSeek-R1-Distill-Qwen-7B
Task: Text generation
Concurrency cost: 1
Model size: 7.6B parameters
Quantization: FP8
Context length: 32k
Published: Apr 13, 2025
Architecture: Transformer

daman1209arora/alpha_0.2_DeepSeek-R1-Distill-Qwen-7B is a 7.6-billion-parameter language model with a 32,768-token context length. Judging by its name, it is a variant of DeepSeek-R1-Distill-Qwen-7B, which distills the reasoning behavior of the much larger DeepSeek-R1 model into a Qwen-7B base for better efficiency at inference time. Its primary use case is general language understanding and generation, with the long context window supporting tasks that require substantial input, such as document summarization or multi-step reasoning over long prompts.
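A minimal inference sketch is shown below, assuming the checkpoint is available on the Hugging Face Hub under this repo id and exposes a chat template like other DeepSeek-R1 distills; the prompt and generation parameters are illustrative, not part of this listing.

```python
# Minimal sketch: load the model with Hugging Face Transformers and
# generate a chat completion. Assumes the repo id below resolves on the
# Hub and the tokenizer ships a chat template (true for the upstream
# DeepSeek-R1-Distill-Qwen-7B; unverified for this variant).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "daman1209arora/alpha_0.2_DeepSeek-R1-Distill-Qwen-7B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # pick the checkpoint's native dtype
    device_map="auto",    # place layers on available GPU(s)/CPU
)

# Build a single-turn chat prompt using the model's own template.
messages = [
    {"role": "user", "content": "Explain model distillation in two sentences."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Generate and decode only the newly produced tokens.
outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```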
