Creekside/Qwen-3B-gsm8k-GRPO

Parameters: 3.1B
Precision: BF16
Context length: 32768 tokens
Last updated: Mar 9, 2025
License: apache-2.0

Model Overview

Creekside/Qwen-3B-gsm8k-GRPO is a 3.1-billion-parameter language model developed by Creekside. It is fine-tuned from the unsloth/qwen2.5-3b-instruct-unsloth-bnb-4bit base model and uses the Qwen2 architecture. As the name indicates, it was fine-tuned on the GSM8K dataset with GRPO (Group Relative Policy Optimization), and the training ran 2x faster thanks to Unsloth and Hugging Face's TRL library.
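For illustration, here is a minimal sketch of what such a training run could look like, assuming Unsloth's FastLanguageModel together with TRL's GRPOTrainer. Only the base checkpoint name comes from this card; the LoRA settings, reward function, and hyperparameters below are illustrative assumptions, not the author's actual recipe.

```python
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer
from unsloth import FastLanguageModel

# Load the 4-bit base checkpoint named on the card.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/qwen2.5-3b-instruct-unsloth-bnb-4bit",
    max_seq_length=2048,  # illustrative; the card advertises 32768
    load_in_4bit=True,
)

# Attach LoRA adapters so GRPO only trains a small set of weights.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# GRPOTrainer expects a "prompt" column; GSM8K ships "question"/"answer".
dataset = load_dataset("openai/gsm8k", "main", split="train")
dataset = dataset.rename_column("question", "prompt")

def reward_has_final_answer(completions, **kwargs):
    # Illustrative reward: score 1.0 when the completion contains the
    # "####" final-answer marker that GSM8K solutions use, else 0.0.
    return [1.0 if "####" in c else 0.0 for c in completions]

trainer = GRPOTrainer(
    model=model,
    processing_class=tokenizer,
    reward_funcs=reward_has_final_answer,
    args=GRPOConfig(
        output_dir="Qwen-3B-gsm8k-GRPO",
        max_completion_length=256,
        num_generations=4,
    ),
    train_dataset=dataset,
)
trainer.train()
```

A real GSM8K reward would parse the number after "####" and compare it against the reference answer; the simple marker check above only demonstrates the reward-function interface.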

Key Capabilities

  • Efficient Training: Benefits from Unsloth's optimizations for faster training, making it a potentially cost-effective option for fine-tuning.
  • Qwen2 Architecture: Inherits the robust capabilities of the Qwen2 model family.
  • Extended Context: Features a context length of 32768 tokens, suitable for processing longer inputs and maintaining conversational coherence over extended interactions (see the loading sketch after this list).
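
A minimal inference sketch, assuming the checkpoint loads through the standard transformers AutoModelForCausalLM API; the prompt and generation settings are illustrative:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Creekside/Qwen-3B-gsm8k-GRPO"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# BF16 matches the precision listed on the card.
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{
    "role": "user",
    "content": "Natalia sold clips to 48 of her friends in April, and then "
               "she sold half as many clips in May. How many clips did "
               "Natalia sell altogether in April and May?",
}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```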

Good For

  • Developers seeking a Qwen2-based model with a focus on training efficiency.
  • Applications requiring a substantial context window for complex tasks.
  • Use cases where rapid iteration and fine-tuning are important considerations.