KKHYA/qwen3-14b-fft-coding

  • Task: Text Generation
  • Concurrency Cost: 1
  • Model Size: 14B
  • Quantization: FP8
  • Context Length: 32k
  • Published: Apr 30, 2026
  • License: apache-2.0
  • Architecture: Transformer
  • Weights: Open

KKHYA/qwen3-14b-fft-coding is a 14-billion-parameter language model, fine-tuned from Qwen/Qwen3-14B and optimized for code generation and understanding. It supports a 32,768-token context window and was trained on specialized coding datasets, including mft_tulu3_personas_code, mft_evol_codealpaca, and mft_codefeedback. The model targets coding-related tasks, making it suitable for developers and applications that need robust code intelligence.
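
A minimal inference sketch using the Hugging Face transformers API follows. The repository id comes from this page; the prompt, precision, and generation settings are illustrative assumptions rather than settings published with the model.

```python
# Minimal sketch: loading KKHYA/qwen3-14b-fft-coding with transformers.
# The prompt and generation settings below are illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "KKHYA/qwen3-14b-fft-coding"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the checkpoint's stored precision
    device_map="auto",    # place layers on available GPUs automatically
)

messages = [
    {"role": "user",
     "content": "Write a Python function that checks whether a string is a palindrome."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=512)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```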

Overview

KKHYA/qwen3-14b-fft-coding is a 14-billion-parameter model fine-tuned from the Qwen3-14B base architecture, with a primary focus on improving code generation and comprehension through specialized training. Fine-tuning used a learning rate of 1e-05, a total batch size of 128, and a cosine learning-rate scheduler over 4 epochs.

Key Capabilities

  • Code Generation: Produces code in a range of programming languages, from small constructs to complete solutions.
  • Code Understanding: Improved ability to interpret and reason about code-related queries.
  • Context Handling: A 32,768-token context window allows the model to process larger codebases or complex problem descriptions; a token-budgeting sketch follows this list.
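
As a hedged illustration of working within that window, the sketch below counts prompt tokens before generation. The 32,768-token limit comes from this page; the file path and headroom value are hypothetical.

```python
# Sketch: budgeting a long code file against the 32k context window.
# MAX_CONTEXT comes from this page; the file path and headroom are
# hypothetical illustration values.
from transformers import AutoTokenizer

MAX_CONTEXT = 32768          # context length stated for this model
RESERVED_FOR_OUTPUT = 1024   # illustrative headroom for the completion

tokenizer = AutoTokenizer.from_pretrained("KKHYA/qwen3-14b-fft-coding")

with open("large_module.py") as f:   # hypothetical input file
    source = f.read()

prompt = f"Review the following module and suggest fixes:\n\n{source}"
n_tokens = len(tokenizer(prompt).input_ids)

if n_tokens > MAX_CONTEXT - RESERVED_FOR_OUTPUT:
    print(f"Prompt is {n_tokens} tokens; trim it to leave room for output.")
else:
    print(f"Prompt fits: {n_tokens} of {MAX_CONTEXT} tokens used.")
```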

Training Details

The model underwent fine-tuning on a combination of coding-centric datasets:

  • mft_tulu3_personas_code
  • mft_evol_codealpaca
  • mft_codefeedback

These datasets drive its specialized performance on coding tasks. Training used the AdamW optimizer (the beta and epsilon values are configured but not published here) in a distributed setup across 8 GPUs; a hedged sketch of the overall configuration follows.
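
Below is a hedged sketch of how the stated hyperparameters might map onto Hugging Face TrainingArguments. The learning rate, cosine schedule, 4 epochs, and effective batch size of 128 come from this card; the per-device split across the 8 GPUs, the mixed-precision setting, and the AdamW beta/epsilon values (left at library defaults) are assumptions.

```python
# Sketch of the stated fine-tuning setup as Hugging Face TrainingArguments.
# lr, cosine schedule, epochs, and the effective batch of 128 are from the
# card; the per-device split, bf16, and AdamW betas/epsilon (defaults here)
# are assumptions.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="qwen3-14b-fft-coding",
    learning_rate=1e-5,                # stated learning rate
    lr_scheduler_type="cosine",        # stated scheduler
    num_train_epochs=4,                # stated epoch count
    per_device_train_batch_size=4,     # assumed: 4 x 4 accumulation x 8 GPUs = 128
    gradient_accumulation_steps=4,
    optim="adamw_torch",               # AdamW; betas/epsilon left at defaults
    bf16=True,                         # assumed mixed precision
)
```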

Good For

  • Developers seeking an LLM for code completion, generation, or debugging assistance.
  • Applications requiring robust code intelligence and understanding.
  • Tasks involving processing and generating code within a large contextual window.