daman1209arora/alpha_0.4_DeepSeek-R1-Distill-Qwen-1.5B
Text Generation · Concurrency Cost: 1 · Model Size: 1.5B · Quant: BF16 · Ctx Length: 32k · Published: Apr 13, 2025 · Architecture: Transformer · Status: Warm
daman1209arora/alpha_0.4_DeepSeek-R1-Distill-Qwen-1.5B is a 1.5 billion parameter language model, apparently a fine-tuned variant of DeepSeek-R1-Distill-Qwen-1.5B, which distills the reasoning behavior of DeepSeek-R1 into a Qwen-based 1.5B architecture. The underlying architecture supports context lengths of up to 131,072 tokens (served here with a 32k window), making it suitable for tasks requiring extensive contextual understanding. Its primary differentiator is that large context window relative to its size, which makes it a good fit for processing and generating long-form content or complex documents.
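As a minimal sketch of local usage, assuming the checkpoint is published on the Hugging Face Hub under the id above and loads with the standard transformers API, the model can be run like this:

```python
# Minimal sketch: load the model in BF16 (matching the quant listed above)
# and generate a short completion. Assumes `torch` and `transformers` are installed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "daman1209arora/alpha_0.4_DeepSeek-R1-Distill-Qwen-1.5B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # BF16, per the quant listed in the metadata
    device_map="auto",
)

prompt = "Summarize the key ideas of model distillation in two sentences."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```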
Popular Sampler Settings
Top 3 parameter combinations used by Featherless users for this model. The tracked sampler parameters are listed below; see the sketch after the table for how they map onto an API request.
| Parameter | Value |
| --- | --- |
| temperature | – |
| top_p | – |
| top_k | – |
| frequency_penalty | – |
| presence_penalty | – |
| repetition_penalty | – |
| min_p | – |
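As a sketch of how these settings are applied in practice, the snippet below sends a chat completion request with explicit sampler values. It assumes Featherless exposes an OpenAI-compatible endpoint (the base URL below is an assumption) and that the non-standard samplers (top_k, repetition_penalty, min_p) are accepted via extra_body; since no popular values are reported above, the numbers used are illustrative placeholders, not recommendations.

```python
# Sketch: one request with explicit sampler settings against an assumed
# OpenAI-compatible Featherless endpoint. All numeric values are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.featherless.ai/v1",  # assumed endpoint
    api_key="YOUR_FEATHERLESS_API_KEY",
)

response = client.chat.completions.create(
    model="daman1209arora/alpha_0.4_DeepSeek-R1-Distill-Qwen-1.5B",
    messages=[{"role": "user", "content": "Explain beam search in one paragraph."}],
    # Standard OpenAI-style sampler parameters:
    temperature=0.7,
    top_p=0.9,
    frequency_penalty=0.0,
    presence_penalty=0.0,
    # Extended samplers, passed through if the backend supports them:
    extra_body={
        "top_k": 40,
        "repetition_penalty": 1.1,
        "min_p": 0.05,
    },
)
print(response.choices[0].message.content)
```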