kangdawei/MMR-Sigmoid-DAPO-7B
Text Generation · Concurrency Cost: 1 · Model Size: 7.6B · Quant: FP8 · Ctx Length: 32k · Published: Dec 18, 2025 · Architecture: Transformer

kangdawei/MMR-Sigmoid-DAPO-7B is a 7.6-billion-parameter language model fine-tuned from deepseek-ai/DeepSeek-R1-Distill-Qwen-7B. It was trained with the DAPO (Decoupled Clip and Dynamic Sampling Policy Optimization) reinforcement learning method on the knoveleng/open-rs dataset, and retains the base model's 131,072-token native context length (served here with a 32k context window). The fine-tune is intended to improve performance on the reasoning-style tasks represented in the open-rs training data rather than general-purpose chat.
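A minimal inference sketch, assuming the checkpoint follows the standard Hugging Face transformers chat interface inherited from its DeepSeek-R1-Distill-Qwen-7B base; the repo id is taken from this card, while the prompt and sampling parameters are illustrative:

```python
# Minimal usage sketch with Hugging Face transformers.
# Assumes the checkpoint ships a chat template inherited from its
# DeepSeek-R1-Distill-Qwen-7B base; generation settings are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "kangdawei/MMR-Sigmoid-DAPO-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "Solve step by step: what is 17 * 24?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Sampling settings here are placeholders, not values prescribed by this card.
outputs = model.generate(
    inputs, max_new_tokens=512, do_sample=True, temperature=0.6, top_p=0.95
)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```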
