kangdawei/DAPO-No-DS-8B
Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 32k · Published: Dec 7, 2025 · Architecture: Transformer

kangdawei/DAPO-No-DS-8B is an 8-billion-parameter language model fine-tuned from deepseek-ai/DeepSeek-R1-Distill-Llama-8B. It was trained with DAPO (Decoupled Clip and Dynamic Sampling Policy Optimization), a reinforcement-learning method, on the knoveleng/open-rs dataset, and specializes in conversational response generation. With a context length of 32768 tokens, the model is suited to generating nuanced, contextually relevant text responses.
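Hosted models like this one are typically queried through an OpenAI-compatible chat-completions endpoint. A minimal sketch of building such a request follows; the endpoint URL is a placeholder (check your provider's documentation), and only the model identifier comes from this card.

```python
import json

# Placeholder endpoint -- substitute your provider's actual URL and API key.
API_URL = "https://example.com/v1/chat/completions"  # assumption, not from the model card

def build_request(prompt: str, max_tokens: int = 512) -> str:
    """Build an OpenAI-style chat-completion payload for DAPO-No-DS-8B."""
    payload = {
        "model": "kangdawei/DAPO-No-DS-8B",
        "messages": [{"role": "user", "content": prompt}],
        # Prompt plus completion must fit the model's 32768-token context window.
        "max_tokens": max_tokens,
    }
    return json.dumps(payload)

body = build_request("Summarize the DAPO training method in one paragraph.")
```

The payload can then be POSTed to the endpoint with any HTTP client; the sketch stops short of sending the request so the exact authentication scheme stays provider-specific.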
