harheem/qwen2.5-3b-grpo-ko-200-16bit
Text generation · Concurrency cost: 1 · Model size: 3.1B · Quant: BF16 · Context length: 32k · Published: Feb 19, 2025 · License: apache-2.0 · Architecture: Transformer · Open weights

The harheem/qwen2.5-3b-grpo-ko-200-16bit model is a 3.1-billion-parameter Qwen2.5-based causal language model published by harheem. It was fine-tuned from unsloth/qwen2.5-3b-instruct-unsloth-bnb-4bit using Unsloth together with Hugging Face's TRL library, a combination that enables up to 2x faster training. Exported in 16-bit (BF16) weights, it is suited to applications that need a compact yet capable language model.
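As a sketch of how such a model is typically used, the snippet below builds a single-turn prompt in the Qwen2.5 ChatML format and shows the standard Hugging Face `transformers` loading pattern. The model id comes from this card; the system message, generation settings, and the `generate_reply` helper are illustrative assumptions, not part of the official card.

```python
MODEL_ID = "harheem/qwen2.5-3b-grpo-ko-200-16bit"


def build_prompt(user_message: str,
                 system_message: str = "You are a helpful assistant.") -> str:
    """Format a single-turn prompt in the Qwen2.5 ChatML style."""
    return (
        f"<|im_start|>system\n{system_message}<|im_end|>\n"
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )


def generate_reply(user_message: str, max_new_tokens: int = 128) -> str:
    """Illustrative helper: load the model and generate one reply.

    Imports are local so the prompt helper above works even without
    transformers/torch installed. Requires network access and enough
    memory for the 3.1B BF16 weights.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
    )
    inputs = tokenizer(build_prompt(user_message), return_tensors="pt")
    inputs = inputs.to(model.device)
    out = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the prompt.
    new_tokens = out[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Note that the prompt is formatted manually here for clarity; in practice `tokenizer.apply_chat_template` produces the same ChatML layout from a list of messages.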
