CodeReview-Qwen32B: Specialized for Code Review
This model, ronantakizawa/codereview-qwen32b, is a 32.8 billion parameter Qwen2.5-Coder-32B-Instruct variant that has been meticulously fine-tuned for the specific task of code review. It leverages a substantial context length of 32768 tokens.
Key Capabilities & Performance
The model demonstrates strong performance in generating code review comments, as evidenced by significant improvements over its base model:
- BLEU-4 Score: Achieved 16.91, a +343% increase from the base model's 3.82.
- ROUGE-L F1 Score: Reached 0.216, marking a +167% improvement from 0.081.
- Comment Type Accuracy: Demonstrated 0.640, a substantial gain from 0.00 in the base model.
Training Details
The fine-tuning process utilized QLoRA SFT on a dataset of 48,000 real code review examples sourced from 504 GitHub repositories, specifically the ronantakizawa/github-codereview dataset. The model is provided in bf16 safetensors format, distributed across 14 shards, totaling approximately 64GB.
Ideal Use Cases
This model is particularly well-suited for:
- Automated Code Review: Generating constructive feedback and suggestions for code changes.
- Developer Tooling: Integrating into IDEs or CI/CD pipelines to assist developers with code quality and best practices.
- Educational Platforms: Providing automated explanations and improvements for student code submissions.