suayptalha/DeepSeek-R1-Distill-Llama-3B
TEXT GENERATIONConcurrency Cost:1Model Size:3.2BQuant:BF16Ctx Length:32kPublished:Feb 23, 2025License:llama3.2Architecture:Transformer0.0K Warm

DeepSeek-R1-Distill-Llama-3B is a 3.2 billion parameter causal language model developed by suayptalha. This model is a distilled version of DeepSeek-R1, fine-tuned on the Llama-3.2-3B architecture using the R1-Distill-SFT dataset. It is designed for general language generation tasks, demonstrating capabilities in reasoning and instruction following, as evidenced by its performance on various benchmarks.

Loading preview...

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model. Click a tab to see each config.

temperature
top_p
top_k
frequency_penalty
presence_penalty
repetition_penalty
min_p