kangdawei/MMR-Sigmoid-DAPO-8B
Text generation · Concurrency cost: 1 · Model size: 8B · Quantization: FP8 · Context length: 32k · Published: Dec 18, 2025 · Architecture: Transformer · Status: Cold

kangdawei/MMR-Sigmoid-DAPO-8B is an 8-billion-parameter language model fine-tuned from deepseek-ai/DeepSeek-R1-Distill-Llama-8B. It was trained with the TRL library using the DAPO reinforcement learning method on the knoveleng/open-rs dataset, and supports a 32,768-token (32k) context length.
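Since the base checkpoint is a standard Hugging Face causal LM, the model can presumably be loaded with the `transformers` library. The sketch below is an assumption based on the card above, not an official usage snippet: the repo id, dtype, and generation settings are illustrative, and running `generate` requires downloading the 8B checkpoint and having sufficient GPU memory.

```python
"""Minimal inference sketch for kangdawei/MMR-Sigmoid-DAPO-8B (assumed HF-compatible)."""

MODEL_ID = "kangdawei/MMR-Sigmoid-DAPO-8B"  # repo id from the card above


def build_messages(question: str) -> list:
    """Wrap a user question in the chat-message format used by apply_chat_template."""
    return [{"role": "user", "content": question}]


def generate(question: str, max_new_tokens: int = 512) -> str:
    """Download the checkpoint and generate a response (heavy: ~8B params)."""
    # Lazy import so the helper above stays usable without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    inputs = tokenizer.apply_chat_template(
        build_messages(question), add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)
```

Because the base model is an R1-style distilled reasoner, responses may include an extended chain-of-thought before the final answer, so a generous `max_new_tokens` budget is advisable.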
