Name: suayptalha/DeepSeek-R1-Distill-Llama-3B API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: suayptalha

Model Overview

DeepSeek-R1-Distill-Llama-3B is a 3.2 billion parameter language model, created by suayptalha. It is a distilled version of DeepSeek-R1, built upon the Llama-3.2-3B architecture and fine-tuned using the R1-Distill-SFT dataset. This model leverages the Llama3 chat template for instruction following and includes a suggested system prompt format for structured reasoning.

Key Capabilities

Instruction Following: Designed to respond effectively to user instructions, utilizing a Llama3-style chat template.
Reasoning: The suggested prompt format encourages a "think-then-answer" approach, as demonstrated in the example usage for logical problem-solving.
Compact Size: At 3.2 billion parameters, it offers a balance between performance and computational efficiency.

Performance Highlights

Evaluated on the Open LLM Leaderboard, the model achieved an average score of 23.27. Specific scores include 70.93 on IFEval (0-Shot) and 21.45 on BBH (3-Shot), indicating its ability to handle various reasoning and instruction-based tasks.

Good For

Instruction-tuned applications: Suitable for tasks requiring the model to follow specific commands or formats.
Reasoning tasks: Benefits from the structured prompting approach to tackle problems requiring logical thought processes.
Resource-constrained environments: Its 3.2B parameter count makes it a viable option where larger models might be impractical.

Overview

Model Overview

Key Capabilities

Performance Highlights

Good For

Full Model Card (README)