Model Overview
Creekside/Qwen-3B-gsm8k-GRPO is a 3.1-billion-parameter language model developed by Creekside. It is fine-tuned from the unsloth/qwen2.5-3b-instruct-unsloth-bnb-4bit base model and uses the Qwen2 architecture; as the name suggests, the fine-tune targets the GSM8K math dataset using GRPO (Group Relative Policy Optimization). A key characteristic of this model is its optimized training process: training ran about 2x faster by using Unsloth together with Hugging Face's TRL library.
Key Capabilities
- Efficient Training: Benefits from Unsloth's optimizations for faster training, making it a potentially cost-effective option for fine-tuning.
- Qwen2 Architecture: Inherits the robust capabilities of the Qwen2 model family.
- Extended Context: Features a context length of 32768 tokens, suitable for processing longer inputs and maintaining conversational coherence over extended interactions.
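To make use of the 32,768-token context window safely, it helps to estimate whether a prompt will fit before sending it. The sketch below uses a rough 4-characters-per-token heuristic (an assumption for English text; exact counts require the model's own tokenizer) and the helper names are illustrative, not part of any library:

```python
# Rough check that an input fits the model's 32,768-token context window.
# NOTE: CHARS_PER_TOKEN is a heuristic assumption; for exact counts,
# tokenize with the model's tokenizer instead.

CONTEXT_LENGTH = 32768
CHARS_PER_TOKEN = 4  # rough average for English prose

def estimated_tokens(text: str) -> int:
    """Crude token estimate from character count."""
    return max(1, len(text) // CHARS_PER_TOKEN)

def fits_in_context(prompt: str, reserved_for_output: int = 1024) -> bool:
    """True if the prompt likely leaves room for the requested output tokens."""
    return estimated_tokens(prompt) + reserved_for_output <= CONTEXT_LENGTH
```

For production use, replace the heuristic with a real count from the model's tokenizer; the structure of the check stays the same.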
Good For
- Developers seeking a Qwen2-based model with a focus on training efficiency.
- Applications requiring a substantial context window for complex tasks.
- Use cases where rapid iteration and fine-tuning are important considerations.
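For the use cases above, prompts to Qwen2-family models are formatted in the ChatML style. A minimal hand-rolled sketch of that wire format is shown below; `build_chatml_prompt` is a hypothetical helper for illustration only, and in practice you would use `tokenizer.apply_chat_template` from transformers:

```python
# Minimal sketch of the ChatML-style prompt format used by Qwen2-family
# models. Hypothetical helper for illustration; prefer the tokenizer's
# built-in chat template in real code.

def build_chatml_prompt(messages: list[dict]) -> str:
    """Render [{'role': ..., 'content': ...}] messages as a ChatML string."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    ]
    parts.append("<|im_start|>assistant\n")  # cue the model to respond
    return "".join(parts)
```

With transformers, the equivalent is `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)`, which guarantees the template matches what the model was trained on.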