kangdawei/MMR-DAPO
Text generation · Concurrency cost: 1 · Model size: 1.5B · Quant: BF16 · Context length: 32k · Published: Dec 7, 2025 · Architecture: Transformer

MMR-DAPO by kangdawei is a 1.5-billion-parameter language model fine-tuned from deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B. It was trained with the DAPO (Decoupled Clip and Dynamic Sampling Policy Optimization) reinforcement-learning method on the knoveleng/open-rs dataset and supports a 131,072-token context length. The model is optimized for generating responses based on this specialized training, making it suitable for conversational AI and text-generation tasks.
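As a sketch of how a model like this is typically used, the snippet below loads it with the Hugging Face `transformers` library. The repo id `kangdawei/MMR-DAPO` is assumed from the listing title, and the chat-message helper and `run_demo` function are illustrative names, not part of the listing; verify the repo id and chat template on the Hub before use.

```python
# Hedged sketch: loading and prompting MMR-DAPO via transformers.
# MODEL_ID is assumed from the listing title; verify it exists on the Hub.

MODEL_ID = "kangdawei/MMR-DAPO"  # assumed Hub repo id


def build_messages(question: str) -> list[dict]:
    """Wrap a user question in the chat-message format consumed by
    tokenizer.apply_chat_template."""
    return [{"role": "user", "content": question}]


def run_demo(question: str = "What is 7 * 8?") -> str:
    # Imports are deferred so the sketch can be read without torch or
    # transformers installed; calling run_demo() downloads the
    # ~1.5B-parameter weights from the Hub.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # BF16, matching the listed quantization
    )
    # The distilled DeepSeek-R1 tokenizers ship their own chat template,
    # so we rely on apply_chat_template rather than hand-building prompts.
    input_ids = tokenizer.apply_chat_template(
        build_messages(question),
        add_generation_prompt=True,
        return_tensors="pt",
    )
    out = model.generate(input_ids, max_new_tokens=512)
    return tokenizer.decode(out[0], skip_special_tokens=True)
```

Calling `run_demo()` performs the actual download and generation; reasoning-distilled models like this one usually emit a chain-of-thought section before the final answer, so a generous `max_new_tokens` budget is sensible.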
