kangdawei/DRA-GRPO-7B
Task: Text generation
Concurrency cost: 1
Model size: 7.6B parameters
Quantization: FP8
Context length: 32k
Published: Nov 20, 2025
Architecture: Transformer

kangdawei/DRA-GRPO-7B is a 7.6 billion parameter language model fine-tuned from deepseek-ai/DeepSeek-R1-Distill-Qwen-7B. It is trained with GRPO (Group Relative Policy Optimization), a reinforcement-learning method originally introduced for mathematical reasoning, using the knoveleng/open-rs dataset. The model targets general text generation, with strengths expected on reasoning-oriented tasks similar to its training data.
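As a minimal usage sketch, the model can be loaded for inference with the Hugging Face `transformers` library. The repository id comes from this card; the dtype, device, and generation settings below are illustrative assumptions, not settings recommended by the model authors:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "kangdawei/DRA-GRPO-7B"

def generate(prompt: str, max_new_tokens: int = 512) -> str:
    """Generate a completion from the model.

    Note: calling this downloads the full model weights and requires
    enough GPU/CPU memory to hold a 7.6B parameter model.
    """
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype="auto",   # pick the checkpoint's native precision
        device_map="auto",    # place layers on available devices
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Because the checkpoint is distilled from a DeepSeek-R1 lineage, prompts that elicit step-by-step reasoning are a natural fit; the 32k context length leaves ample room for long chains of thought.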
