kangdawei/DAPO-No-DS
TEXT GENERATIONConcurrency Cost:1Model Size:1.5BQuant:BF16Ctx Length:32kPublished:Dec 7, 2025Architecture:Transformer Warm

DAPO-No-DS is a 1.5 billion parameter language model developed by kangdawei, fine-tuned from DeepSeek-R1-Distill-Qwen-1.5B. It was trained using the DAPO reinforcement learning method on the knoveleng/open-rs dataset, featuring a 131072 token context length. This model is specifically optimized for tasks related to open-ended response generation, leveraging advanced RL techniques for improved conversational abilities.

Loading preview...