fhai50032/Qwen2.5-GRPO-7B
Task: Text generation
Model Size: 7.6B
Quant: FP8
Ctx Length: 32k
Concurrency Cost: 1
Published: Feb 7, 2025
License: apache-2.0
Architecture: Transformer (open weights)

fhai50032/Qwen2.5-GRPO-7B is a 7.6 billion parameter causal language model developed by fhai50032, fine-tuned from unsloth/Qwen2.5-7B-Instruct-unsloth-bnb-4bit. The model was trained with Unsloth and Hugging Face's TRL library, enabling 2x faster training. It is designed for general language generation tasks, and its efficient training methodology makes it a performant base for a variety of applications.
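As a Qwen2.5-Instruct derivative, the model can be loaded with the standard Hugging Face `transformers` text-generation API. The sketch below is illustrative, not from the model card: the generation settings (`max_new_tokens`, dtype, device placement) are assumptions, and running it requires a GPU with enough memory for a 7.6B model.

```python
"""Minimal sketch: chat-style generation with fhai50032/Qwen2.5-GRPO-7B.

Assumes the standard transformers AutoModelForCausalLM / AutoTokenizer API;
system prompt and generation settings are illustrative placeholders.
"""

MODEL_ID = "fhai50032/Qwen2.5-GRPO-7B"


def build_chat(user_message: str) -> list[dict]:
    # Qwen2.5-Instruct derivatives expect chat-formatted input; the
    # tokenizer's chat template turns this into the model's prompt format.
    return [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": user_message},
    ]


def generate(prompt: str, max_new_tokens: int = 256) -> str:
    # Heavyweight imports are deferred so the module can be inspected
    # without transformers/torch installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, device_map="auto", torch_dtype="auto"
    )
    text = tokenizer.apply_chat_template(
        build_chat(prompt), tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(
        out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )


if __name__ == "__main__":
    print(generate("Explain GRPO in one sentence."))
```

Since the model fits in 32k tokens of context, longer documents can be passed directly in the user message; for FP8 or quantized serving, an inference engine such as vLLM is the more common route than raw `transformers`.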
