kangdawei/DAPO
kangdawei/DAPO is a 1.5 billion parameter language model fine-tuned from deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B. It was trained using the DAPO reinforcement learning method on the knoveleng/open-rs dataset, specializing in generating diverse and thoughtful responses to open-ended questions. This model is optimized for conversational AI and creative text generation tasks.
Model Overview
kangdawei/DAPO is a 1.5 billion parameter language model, a fine-tuned variant of deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B. The model distinguishes itself through its training methodology: the DAPO (Decoupled Clip and Dynamic sAmpling Policy Optimization) reinforcement learning method, detailed in the paper "DAPO: An Open-Source LLM Reinforcement Learning System at Scale" (arXiv:2503.14476). Training leveraged the knoveleng/open-rs dataset, indicating a focus on generating responses for open-ended or conversational scenarios.
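The decoupled clipping ("clip-higher") at the heart of DAPO can be sketched per token as follows. This is an illustrative re-implementation based on the paper's objective, not code from this repository; the default clip ranges `eps_low`/`eps_high` follow the values reported in the paper and are assumptions here.

```python
# Illustrative sketch of DAPO's decoupled ("clip-higher") per-token
# surrogate objective; not code shipped with this model.

def dapo_token_objective(ratio, advantage, eps_low=0.2, eps_high=0.28):
    """Per-token clipped surrogate with decoupled clip ranges.

    ratio:     pi_theta(token) / pi_old(token), the importance ratio
    advantage: group-normalized advantage for the response this token
               belongs to
    eps_low / eps_high: DAPO decouples the lower and upper clip bounds,
        raising the upper bound so low-probability tokens can still gain
        probability mass during exploration ("clip-higher").
    """
    clipped = min(max(ratio, 1.0 - eps_low), 1.0 + eps_high)
    return min(ratio * advantage, clipped * advantage)

# With a positive advantage, ratios above 1 + eps_high are clipped:
print(dapo_token_objective(1.5, 1.0))   # 1.28
# With a negative advantage, the unclipped term dominates:
print(dapo_token_objective(1.5, -1.0))  # -1.5
```

Raising only the upper bound keeps the conservative lower clip of PPO/GRPO while letting the policy increase the probability of rare but useful tokens, which the DAPO paper identifies as a driver of response diversity.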
Key Capabilities
- Reinforcement Learning Optimization: Benefits from the DAPO method, suggesting enhanced conversational quality and alignment with human preferences.
- Fine-tuned for Open-ended Responses: Specifically trained on the knoveleng/open-rs dataset, making it suitable for generating creative and diverse answers to complex prompts.
- Efficient Size: At 1.5 billion parameters, it offers a balance between performance and computational efficiency, making it accessible for various applications.
Good For
- Conversational AI: Generating engaging and coherent dialogue.
- Creative Text Generation: Crafting imaginative responses to abstract or philosophical questions.
- Prototyping and Research: Exploring the impact of DAPO on smaller, yet capable, language models.
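A minimal usage sketch, assuming the checkpoint loads through the standard Hugging Face transformers API as its base model DeepSeek-R1-Distill-Qwen-1.5B does. The helper name and sampling parameters are illustrative, not part of this repository:

```python
MODEL_ID = "kangdawei/DAPO"

def generate_response(prompt: str, max_new_tokens: int = 256) -> str:
    """Load the model and sample one response (illustrative helper).

    Assumes the checkpoint is compatible with AutoModelForCausalLM,
    as its base model DeepSeek-R1-Distill-Qwen-1.5B is.
    """
    # Imported inside the helper so it can be defined without
    # transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,    # sampling suits the diverse, open-ended
        temperature=0.7,   # responses this card targets; values are
    )                      # illustrative, not tuned
    # Decode only the generated continuation, not the prompt tokens.
    return tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[1]:],
        skip_special_tokens=True,
    )
```

For multi-turn conversational use, `tokenizer.apply_chat_template` with a list of role/content messages is the more idiomatic entry point if the tokenizer ships a chat template.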