dwt012/vit2sql-q-grpo
dwt012/vit2sql-q-grpo is a 7.6 billion parameter Qwen2-based model developed by dwt012, fine-tuned from unsloth/qwen2.5-coder-7b-instruct-bnb-4bit. This model was trained using Unsloth and Huggingface's TRL library, achieving 2x faster training. It is optimized for specific tasks related to its base model's coding and instruction-following capabilities, offering a context length of 32768 tokens.
Loading preview...
Model Overview
dwt012/vit2sql-q-grpo is a 7.6 billion parameter language model developed by dwt012. It is fine-tuned from the unsloth/qwen2.5-coder-7b-instruct-bnb-4bit base model, leveraging the Qwen2 architecture. This model was specifically trained using the Unsloth library, which enabled a 2x faster training process, in conjunction with Huggingface's TRL library.
Key Characteristics
- Base Architecture: Qwen2
- Parameter Count: 7.6 billion parameters
- Training Efficiency: Achieved 2x faster training through the use of Unsloth.
- Context Length: Supports a context window of 32768 tokens.
Good For
- Applications requiring a Qwen2-based model with optimized training.
- Tasks benefiting from a model fine-tuned from
unsloth/qwen2.5-coder-7b-instruct-bnb-4bit.