kangdawei/MMR-Sigmoid-DAPO-8B
kangdawei/MMR-Sigmoid-DAPO-8B is an 8-billion-parameter language model fine-tuned from deepseek-ai/DeepSeek-R1-Distill-Llama-8B. It was trained with the TRL library and the DAPO reinforcement learning method on the knoveleng/open-rs dataset, and supports a 32,768-token context window.
Model Overview
kangdawei/MMR-Sigmoid-DAPO-8B is an 8-billion-parameter language model based on deepseek-ai/DeepSeek-R1-Distill-Llama-8B. It was fine-tuned with the TRL library using the DAPO reinforcement learning method.
Key Characteristics
- Base Model: Fine-tuned from deepseek-ai/DeepSeek-R1-Distill-Llama-8B.
- Training Method: Trained with DAPO ("DAPO: An Open-Source LLM Reinforcement Learning System at Scale"), a reinforcement learning approach.
- Dataset: Trained on the knoveleng/open-rs dataset, suggesting specialization in the areas covered by that data.
- Context Length: Supports a context window of 32,768 tokens.
Usage
This model is suitable for text generation tasks, particularly those benefiting from its specialized training and the DAPO optimization. Developers can integrate it using the Hugging Face transformers library, as demonstrated in the quick start example provided in the original model card.
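A minimal loading sketch with the transformers library is shown below. The model ID comes from this card; the prompt and generation parameters are illustrative assumptions, not settings recommended by the model authors.

```python
# Sketch: load kangdawei/MMR-Sigmoid-DAPO-8B and generate text.
# Prompt and max_new_tokens are arbitrary examples.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "kangdawei/MMR-Sigmoid-DAPO-8B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the dtype stored in the checkpoint
    device_map="auto",    # place weights on available GPU(s)/CPU
)

prompt = "Explain reinforcement learning in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens, skipping the prompt.
generated = outputs[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(generated, skip_special_tokens=True))
```

Since the checkpoint is a DeepSeek-R1 distill, responses may include extended reasoning before the final answer, so a generous `max_new_tokens` budget is advisable.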