DeepSeek-R1-Distill-Llama-3B is a 3.2-billion-parameter causal language model developed by suayptalha. It distills DeepSeek-R1 into the Llama-3.2-3B base model through fine-tuning on the R1-Distill-SFT dataset. It is designed for general language generation, and its benchmark results indicate capability in reasoning and instruction following.