DrRiceIO7/Gemma3-4B-CoT
DrRiceIO7/Gemma3-4B-CoT is a 4.3-billion-parameter language model developed by DrRiceIO7, finetuned from unsloth/gemma-3-4b-pt-unsloth-bnb-4bit. It was trained using Unsloth together with Hugging Face's TRL library, achieving roughly 2x faster training. The model is released under the Apache-2.0 license; its primary use case is not detailed, though the developer notes it is intended for GRPO.
DrRiceIO7/Gemma3-4B-CoT Overview
This model, developed by DrRiceIO7, is a 4.3-billion-parameter language model finetuned from unsloth/gemma-3-4b-pt-unsloth-bnb-4bit. It was trained using the Unsloth library in conjunction with Hugging Face's TRL library, which enabled a roughly 2x faster training process. The model is released under the Apache-2.0 license.
Key Characteristics
- Parameter Count: 4.3 billion.
- Base Model: Finetuned from unsloth/gemma-3-4b-pt-unsloth-bnb-4bit, a 4-bit quantized Gemma 3 4B variant.
- Training Efficiency: Trained with Unsloth and TRL for roughly 2x faster training.
- Context Length: Supports a context length of 32,768 tokens.
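The card does not include usage code. A minimal loading sketch, assuming the standard Hugging Face `transformers` API (loading downloads the full 4.3B-parameter checkpoint, so run on suitable hardware):

```python
# Hypothetical usage sketch — assumes the standard transformers API;
# nothing here comes from the model card itself.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "DrRiceIO7/Gemma3-4B-CoT"

def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Load the model and return a completion for `prompt`."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

if __name__ == "__main__":
    # A CoT-style prompt, since the model name suggests chain-of-thought tuning.
    print(generate("Explain step by step: what is 17 * 24?"))
```

Since the checkpoint descends from a bnb-4bit base, quantized loading (e.g. via `bitsandbytes`) may also be appropriate depending on how the weights were saved.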
Potential Use Cases
While the README does not specify a primary use case, the developer mentions the model will be used for "GRPO" — most likely Group Relative Policy Optimization, a reinforcement-learning finetuning method supported by TRL — suggesting it is tailored for a specific internal or research-oriented task. Developers looking for a Gemma-based model trained with efficient methods may find it a useful starting point for further experimentation or domain-specific finetuning.