AdKaLu/DeepSeek-R1-Distill-Llama-8B
Text Generation
Concurrency Cost: 1 | Model Size: 8B | Quant: FP8 | Context Length: 32k | Published: Mar 26, 2026 | License: MIT | Architecture: Transformer | Open Weights

AdKaLu/DeepSeek-R1-Distill-Llama-8B is an 8-billion-parameter language model distilled from the DeepSeek-R1 reasoning model on top of the Llama-3.1-8B base, with a 32,768-token context length. Developed by DeepSeek-AI, it is fine-tuned on reasoning data generated by the larger DeepSeek-R1 to strengthen its mathematical, coding, and general reasoning capabilities, bringing advanced reasoning patterns to a smaller, more efficient model.
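
For reference, below is a minimal inference sketch using the Hugging Face transformers library. It assumes the weights are published under the repo id AdKaLu/DeepSeek-R1-Distill-Llama-8B and loads them in bf16 for simplicity (serving the FP8 quantization listed above typically requires a dedicated stack such as vLLM); the prompt and generation parameters are illustrative, not prescribed by this page.

```python
# A minimal sketch, assuming the weights are hosted under this repo id
# on the Hugging Face Hub and follow the standard Llama/transformers layout.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "AdKaLu/DeepSeek-R1-Distill-Llama-8B"  # assumed Hub repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # bf16 here; the hosted FP8 needs a serving stack
    device_map="auto",           # requires `accelerate`; spreads across GPUs/CPU
)

# R1-distilled models expect a chat-style prompt; the chat template shipped
# with the tokenizer inserts the required special tokens.
messages = [{"role": "user", "content": "What is 7 * 13? Think step by step."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(
    inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.6,  # DeepSeek suggests ~0.6 for the R1 distills
)
# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```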
