daman1209arora/alpha_0.4_DeepSeek-R1-Distill-Qwen-7B
TEXT GENERATION
Concurrency Cost: 1 · Model Size: 7.6B · Quant: FP8 · Ctx Length: 32k · Published: Apr 13, 2025 · Architecture: Transformer

daman1209arora/alpha_0.4_DeepSeek-R1-Distill-Qwen-7B is a 7.6 billion parameter language model. The underlying base model supports a context length of up to 131,072 tokens, although the deployment listed above is configured for 32k. As its name indicates, it is a distilled model: DeepSeek-R1's reasoning capabilities distilled into a Qwen 7B base. Its primary use case is general language understanding and generation, with the large context window benefiting long-input tasks.
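As a sketch of how such a hosted model is typically queried, the snippet below builds an OpenAI-compatible chat-completions request body for this model ID. The endpoint path, the sampling parameters, and the prompt are illustrative assumptions, not documented values for this provider; only the payload construction is shown, and sending it requires your provider's URL and API key.

```python
import json

# Model ID as listed on this page.
MODEL_ID = "daman1209arora/alpha_0.4_DeepSeek-R1-Distill-Qwen-7B"

# Hypothetical OpenAI-compatible chat request; the prompt and sampling
# parameters are placeholders, not provider-documented defaults.
payload = {
    "model": MODEL_ID,
    "messages": [
        {"role": "user", "content": "Summarize the benefits of model distillation."}
    ],
    "max_tokens": 512,
    "temperature": 0.6,
}

# Serialize the request body; POST this to the provider's
# chat-completions endpoint (e.g. /v1/chat/completions) with an API key.
body = json.dumps(payload)
print(body)
```

Since the listing advertises FP8 quantization and a fixed concurrency cost, this style of hosted-API call is the usual access path rather than loading the weights locally.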
