Model Overview
trnqphu/deepseek-r1-4b is a 1.5 billion parameter language model fine-tuned from the deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B base model. It supports a context length of 32,768 tokens, making it suitable for processing long inputs and generating coherent, extended responses. The model was trained with Supervised Fine-Tuning (SFT) using TRL 0.16.1, Transformers 4.51.3, and PyTorch 2.6.0.
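The advertised context window can be checked directly from the model's configuration. A minimal sketch (the model ID is taken from this card; fetching the config requires network access to the Hugging Face Hub, and the reported value may differ from the 32,768 tokens stated above if the repository config advertises a larger positional range):

```python
# Sketch: inspect the model's configured maximum position embeddings.
# Model ID from this card; requires network access to the Hugging Face Hub.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("trnqphu/deepseek-r1-4b")
print(config.max_position_embeddings)  # this card states a 32,768-token context length
```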
Key Capabilities
- General Text Generation: Capable of generating human-like text based on given prompts.
- Long Context Understanding: Its 32,768-token context window enables more comprehensive understanding and generation across long conversations or documents.
- Fine-tuned Performance: Leverages SFT to enhance its performance for various language tasks.
Use Cases
This model is well-suited for applications requiring:
- Conversational AI: Engaging in extended dialogues.
- Content Creation: Generating articles, stories, or other forms of written content.
- Question Answering: Providing detailed answers based on extensive context.
Developers can integrate and experiment with this model using the transformers library, as demonstrated in the quick start example.