KKHYA/qwen3-1.7b-fft-coding
KKHYA/qwen3-1.7b-fft-coding is a 2 billion parameter language model, fine-tuned from Qwen/Qwen3-1.7B, specifically optimized for coding tasks. This model leverages a 32768 token context length and is trained on specialized datasets including mft_tulu3_personas_code, mft_evol_codealpaca, and mft_codefeedback. It is designed to excel in code generation and understanding, making it suitable for various programming-related applications.
Loading preview...
Model Overview
KKHYA/qwen3-1.7b-fft-coding is a 2 billion parameter language model, fine-tuned from the Qwen3-1.7B architecture. This model has been specifically adapted for coding tasks through extensive training on a curated set of code-centric datasets.
Key Capabilities
- Code Generation: Optimized for generating various programming language constructs and solutions.
- Code Understanding: Enhanced ability to interpret and process code-related prompts.
- Extended Context: Features a substantial 32768 token context window, beneficial for handling larger codebases or complex programming problems.
Training Details
The model was fine-tuned using a learning rate of 1e-05, a total batch size of 128, and a cosine learning rate scheduler over 4 epochs. The training incorporated an AdamW optimizer with specific beta and epsilon parameters. The training utilized a multi-GPU setup with 8 devices.
Good For
- Code Assistants: Developing tools that assist developers with writing or debugging code.
- Automated Code Generation: Tasks requiring the creation of code snippets or functions based on natural language descriptions.
- Educational Tools: Applications for teaching programming concepts or providing coding exercises.