bknyaz/Qwen3-0.6B-Code
The bknyaz/Qwen3-0.6B-Code is a 0.8 billion parameter language model, fine-tuned from Qwen/Qwen3-0.6B, specifically optimized for code generation tasks. It leverages the evol-codealpaca-v1 dataset to enhance its programming capabilities. With a context length of 32768 tokens, this model demonstrates improved performance on code-related benchmarks like HumanEval, making it suitable for various code-centric applications.
Loading preview...
Overview
This model, bknyaz/Qwen3-0.6B-Code, is a fine-tuned version of the Qwen/Qwen3-0.6B base model, specifically adapted for code generation. It was trained using the evol-codealpaca-v1 dataset, focusing on improving its ability to understand and generate code. The fine-tuning process utilized the TRL library with SFT/full-rank options, and the model maintains a substantial context length of 32768 tokens.
Key Capabilities
- Enhanced Code Generation: Fine-tuned on a specialized code dataset to improve programming task performance.
- HumanEval Benchmark Improvement: Achieved a score of 46.3 on the HumanEval (instruct) benchmark, a notable increase from the base Qwen3-0.6B's 38.4.
- Conversational Format Preprocessing: The training data was preprocessed into a conversational format, which can be beneficial for instruction-following in code generation.
Use Cases
- Code Completion and Generation: Ideal for tasks requiring the generation of code snippets or completing existing code.
- Educational Tools: Can be integrated into platforms for learning and practicing programming.
- Developer Assistance: Useful for developers seeking assistance with coding challenges or generating boilerplate code.
Training and Evaluation Details
The model was fine-tuned on a single A100 GPU. Evaluation was conducted using lm_eval on the humaneval_instruct benchmark, demonstrating its specialized proficiency in coding tasks.