vinhnx90/vt-qwen-3b-GRPO-merged-16bit

Hosted on Hugging Face

Task: Text generation | Model size: 3.1B parameters | Quantization: BF16 | Context length: 32k | Published: Mar 23, 2025 | License: apache-2.0 | Architecture: Transformer (open weights)

vinhnx90/vt-qwen-3b-GRPO-merged-16bit is a 3.1-billion-parameter model based on Qwen2.5-3B-Instruct, developed by vinhnx90. It was fine-tuned using Unsloth together with Hugging Face's TRL library, which the author reports enabled roughly 2x faster training. The model is intended for general instruction-following tasks.


Model Overview

The vinhnx90/vt-qwen-3b-GRPO-merged-16bit is a 3.1 billion parameter language model based on the Qwen2.5-3B-Instruct architecture. Developed by vinhnx90, this model was fine-tuned from unsloth/qwen2.5-3b-instruct-unsloth-bnb-4bit.

Key Characteristics

  • Efficient Training: Fine-tuned roughly 2x faster by combining the Unsloth library with Hugging Face's TRL library.
  • Architecture: It leverages the Qwen2.5-3B-Instruct base, known for its instruction-following capabilities.
  • Parameter Count: With 3.1 billion parameters, it offers a balance between performance and computational efficiency.
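Since the model inherits its instruction-following behavior from the Qwen2.5-Instruct family, prompts follow that family's ChatML-style template. The sketch below shows the assumed format; the exact special tokens come from the base model, and in practice you should rely on the tokenizer's built-in chat template rather than hand-building strings.

```python
# Minimal sketch of the ChatML-style prompt format used by Qwen2.5-Instruct
# models. Token names (<|im_start|>, <|im_end|>) are assumptions based on the
# base model; prefer tokenizer.apply_chat_template in real code.

def build_chat_prompt(messages):
    """Render a list of {role, content} dicts into a ChatML prompt string."""
    parts = []
    for msg in messages:
        parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n")
    # A trailing assistant header cues the model to generate its reply.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = build_chat_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize GRPO in one sentence."},
])
```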

Use Cases

This model is suitable for general-purpose instruction-following tasks where a moderately sized, efficiently trained language model is a good fit. Its fast training process makes it a candidate for workflows that need rapid iteration, and its 3.1B parameter count allows deployment in resource-constrained environments.
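For trying the model, a hedged loading sketch using the Hugging Face transformers library is shown below. The model id comes from this card; the generation settings (max_new_tokens, temperature) are illustrative assumptions, not recommendations from the author. The heavy import and weight download are deferred into the function so the sketch can be inspected without transformers installed.

```python
# Hypothetical usage sketch for this model with the transformers library.
# MODEL_ID is from the card; GEN_KWARGS values are illustrative assumptions.

MODEL_ID = "vinhnx90/vt-qwen-3b-GRPO-merged-16bit"
GEN_KWARGS = {"max_new_tokens": 256, "temperature": 0.7, "do_sample": True}

def generate_reply(user_message: str) -> str:
    """Download the weights (only when called) and generate one reply."""
    # Deferred import: keeps module import lightweight.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # Card lists BF16 quantization, so load in bfloat16.
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="bfloat16")

    messages = [{"role": "user", "content": user_message}]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    )
    output = model.generate(input_ids, **GEN_KWARGS)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)
```

Calling `generate_reply("Hello!")` downloads roughly 6 GB of BF16 weights on first use, so a GPU or a machine with ample RAM is advisable.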