AgPerry/Qwen2.5-Coder-14B-Instruct-num11_v1-v2-v3-pairs-v3-triples
AgPerry/Qwen2.5-Coder-14B-Instruct-num11 is an instruction-tuned causal language model with 14.8 billion parameters, fine-tuned from Qwen/Qwen2.5-Coder-14B-Instruct. It specializes in code generation and understanding, having been further trained on multiple datasets focused on fill-in-the-middle (FIM) tasks. With a 32,768-token context window, it is designed for developers who need a robust model for programming-related applications.
Overview
AgPerry/Qwen2.5-Coder-14B-Instruct-num11 is a 14.8-billion-parameter instruction-tuned language model built on Qwen/Qwen2.5-Coder-14B-Instruct. It was further fine-tuned on several specialized datasets: fim_midtrain_v1, fim_midtrain_v2, fim_midtrain_v3_multi_pairs, fim_midtrain_v3_multi_pairs_0317, fim_midtrain_v3_multi_triples, and fim_midtrain_v3_multi_triples_0317. These datasets target fill-in-the-middle (FIM) tasks, indicating a strong focus on code completion and generation capabilities.
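For reference, the base Qwen2.5-Coder family documents dedicated FIM special tokens (`<|fim_prefix|>`, `<|fim_suffix|>`, `<|fim_middle|>`). Assuming this fine-tune inherits them, a FIM prompt would be assembled roughly as follows:

```python
# Hypothetical FIM prompt, assuming this fine-tune inherits the FIM special
# tokens documented for the base Qwen2.5-Coder family.
prefix = "def add(a, b):\n    "    # code before the gap
suffix = "\n    return result\n"   # code after the gap
fim_prompt = f"<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>"
# The model is expected to generate the missing middle, e.g. "result = a + b".
```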
Key Characteristics
- Base Model: Fine-tuned from Qwen/Qwen2.5-Coder-14B-Instruct.
- Parameter Count: 14.8 billion parameters.
- Context Length: Supports a context window of 32,768 tokens.
- Specialization: Enhanced for code-related tasks, particularly those involving fill-in-the-middle scenarios, through targeted fine-tuning.
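The model should load like any other Qwen2.5-based checkpoint via Hugging Face Transformers. A minimal sketch (the dtype and device settings below are illustrative defaults, not requirements):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "AgPerry/Qwen2.5-Coder-14B-Instruct-num11"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # keep the checkpoint's native precision
    device_map="auto",    # shard across available GPUs for a 14.8B model
)
```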
Training Details
The model was trained for one epoch with a learning rate of 1e-05 and an effective batch size of 128 (per-device batch size of 1 × gradient accumulation of 16 × 8 GPUs), using the AdamW optimizer with a cosine learning rate scheduler.
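The exact training script is not published; as a rough sketch, the reported hyperparameters map onto standard Hugging Face `TrainingArguments` fields as shown below (the field values mirror the card, but the configuration itself is ours, not the authors'):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="qwen2.5-coder-14b-fim",  # hypothetical output path
    learning_rate=1e-5,
    per_device_train_batch_size=1,       # 1 x 16 accumulation x 8 GPUs = 128 effective
    gradient_accumulation_steps=16,
    num_train_epochs=1,
    optim="adamw_torch",                 # AdamW optimizer
    lr_scheduler_type="cosine",          # cosine learning rate schedule
)
```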
Intended Use Cases
This model is primarily intended for applications requiring advanced code generation, completion, and understanding. Its fine-tuning on FIM datasets suggests strong performance in scenarios where code needs to be intelligently filled in or completed based on surrounding context.
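Putting the pieces together, an end-to-end FIM completion might look like the following sketch, reusing the `model`, `tokenizer`, and `fim_prompt` defined in the earlier examples:

```python
# Illustrative FIM completion; assumes model, tokenizer, and fim_prompt
# from the sketches above.
inputs = tokenizer(fim_prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)

# Decode only the newly generated tokens (the predicted "middle").
middle = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:],
    skip_special_tokens=True,
)
print(middle)
```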