unsloth/DeepSeek-R1-Distill-Llama-8B
Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Context Length: 32k · Published: Jan 20, 2025 · License: llama3.1 · Architecture: Transformer

DeepSeek-R1-Distill-Llama-8B is an 8-billion-parameter language model developed by DeepSeek AI, distilled from the 671B-parameter DeepSeek-R1 and based on Llama-3.1-8B. It specializes in reasoning tasks, carrying over patterns learned by the much larger teacher model to achieve strong performance on math, code, and general reasoning. The model offers a 32,768-token context length and is designed to bring advanced reasoning capabilities to a smaller, more efficient architecture.
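As a minimal sketch of running the model locally, the snippet below loads the checkpoint with the transformers library, assuming the model ID above resolves on the Hugging Face Hub. The prompt and the generation settings (max_new_tokens, temperature, top_p) are illustrative assumptions, not values recommended by this page.

```python
# Minimal sketch: loading unsloth/DeepSeek-R1-Distill-Llama-8B with transformers.
# Generation settings below are illustrative, not recommendations from this page.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "unsloth/DeepSeek-R1-Distill-Llama-8B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")

# Reasoning-distilled models are typically given a plain chat turn and a
# generous token budget so the chain of thought is not cut off.
messages = [{"role": "user", "content": "What is 17 * 24? Think step by step."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512, do_sample=True, temperature=0.6, top_p=0.95)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```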


Popular Sampler Settings

The top 3 parameter combinations used by Featherless users for this model cover the samplers listed below; the request sketch after the list shows one way to pass them through an API call.

temperature
top_p
top_k
frequency_penalty
presence_penalty
repetition_penalty
min_p
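The sketch below passes these samplers through an OpenAI-compatible chat completions request. The base URL, the example values, and the use of extra_body for samplers outside the OpenAI schema are assumptions for illustration, not the actual top configs reported on this page.

```python
# Sketch: sending the samplers listed above via an OpenAI-compatible request.
# Base URL and all parameter values are assumptions, not reported user configs.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.featherless.ai/v1",  # assumed OpenAI-compatible endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="unsloth/DeepSeek-R1-Distill-Llama-8B",
    messages=[{"role": "user", "content": "Prove that the square root of 2 is irrational."}],
    temperature=0.6,           # example value only
    top_p=0.95,
    frequency_penalty=0.0,
    presence_penalty=0.0,
    max_tokens=1024,
    extra_body={               # samplers not in the OpenAI schema go in extra_body
        "top_k": 40,
        "min_p": 0.05,
        "repetition_penalty": 1.05,
    },
)
print(response.choices[0].message.content)
```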