KKHYA/qwen3-14b-fft-coding
KKHYA/qwen3-14b-fft-coding is a 14-billion-parameter language model fine-tuned from Qwen/Qwen3-14B and optimized for code generation and understanding. It supports a 32768-token context window and was trained on specialized coding datasets, including mft_tulu3_personas_code, mft_evol_codealpaca, and mft_codefeedback, making it suitable for developers and applications that need robust code intelligence.
Overview
KKHYA/qwen3-14b-fft-coding is a 14-billion-parameter model fine-tuned from the Qwen3-14B base architecture, with a primary focus on improving code generation and comprehension through specialized training. It was trained with a learning rate of 1e-05, a total batch size of 128, and a cosine learning-rate scheduler over 4 epochs.
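The cosine schedule mentioned above can be sketched in plain Python. This is a minimal, warmup-free form; the card only states "cosine", so the exact scheduler settings (warmup, minimum LR, step count) are assumptions for illustration:

```python
import math

PEAK_LR = 1e-5       # learning rate from the model card
TOTAL_STEPS = 1000   # illustrative only; the card does not state step counts

def cosine_lr(step: int, peak_lr: float = PEAK_LR,
              total_steps: int = TOTAL_STEPS) -> float:
    """Cosine decay from peak_lr at step 0 down to ~0 at the final step."""
    progress = min(step, total_steps) / total_steps
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))

print(cosine_lr(0))            # peak learning rate: 1e-05
print(cosine_lr(TOTAL_STEPS))  # decays to ~0 at the end of training
```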
Key Capabilities
- Code Generation: Optimized for generating various programming language constructs and solutions.
- Code Understanding: Improved ability to interpret and process code-related queries.
- Context Handling: A substantial 32768-token context window allows it to process larger codebases and complex problem descriptions.
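As a rough illustration of working within that window, a prompt-budget check might look like the following. The 4-characters-per-token heuristic and the helper name are assumptions, not part of the model's tooling; a real tokenizer should be used for exact counts:

```python
CONTEXT_WINDOW = 32768  # tokens, from the model card

def fits_in_context(prompt: str, max_new_tokens: int,
                    chars_per_token: float = 4.0) -> bool:
    """Rough check: estimated prompt tokens plus the generation
    budget must not exceed the 32768-token context window."""
    est_prompt_tokens = len(prompt) / chars_per_token
    return est_prompt_tokens + max_new_tokens <= CONTEXT_WINDOW

# A short prompt easily fits when reserving 1024 tokens for generation.
print(fits_in_context("def add(a, b):\n    return a + b", 1024))  # True
```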
Training Details
The model underwent fine-tuning on a combination of coding-centric datasets:
- mft_tulu3_personas_code
- mft_evol_codealpaca
- mft_codefeedback
These datasets contribute to its specialized performance on coding tasks. Training used the AdamW optimizer with tuned beta and epsilon parameters and a distributed setup across 8 GPUs.
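For reference, a single AdamW update can be sketched as below. The beta, epsilon, and weight-decay values shown are common defaults, not the run's actual settings, which the card does not specify:

```python
def adamw_step(param, grad, m, v, t,
               lr=1e-5, beta1=0.9, beta2=0.999, eps=1e-8, weight_decay=0.01):
    """One AdamW update for a single scalar parameter.
    Returns the updated (param, m, v)."""
    m = beta1 * m + (1 - beta1) * grad          # first-moment EMA
    v = beta2 * v + (1 - beta2) * grad * grad   # second-moment EMA
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    param -= lr * weight_decay * param          # decoupled weight decay
    param -= lr * m_hat / (v_hat ** 0.5 + eps)  # Adam-style update
    return param, m, v

p, m, v = 1.0, 0.0, 0.0
p, m, v = adamw_step(p, grad=0.5, m=m, v=v, t=1)
```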
Good For
- Developers seeking an LLM for code completion, generation, or debugging assistance.
- Applications requiring robust code intelligence and understanding.
- Tasks involving processing and generating code within a large contextual window.