yueqis/swe_only-qwen-coder-7b-3epochs-30k-5e-5
The yueqis/swe_only-qwen-coder-7b-3epochs-30k-5e-5 model is a 7.6 billion parameter language model, fine-tuned from Qwen/Qwen2.5-Coder-7B-Instruct. This model is specifically optimized for code generation and understanding tasks, having been fine-tuned on the 'swe_only' dataset. It is designed for applications requiring robust code-related capabilities.
Overview
This model, yueqis/swe_only-qwen-coder-7b-3epochs-30k-5e-5, is a specialized version of the Qwen2.5-Coder-7B-Instruct architecture, featuring 7.6 billion parameters. It has undergone further fine-tuning specifically on the swe_only dataset, indicating a focus on particular software engineering tasks or code domains.
Key Characteristics
- Base Model: Fine-tuned from Qwen/Qwen2.5-Coder-7B-Instruct.
- Parameter Count: 7.6 billion parameters.
- Context Length: Supports a context length of 32768 tokens.
- Training Focus: Specialized fine-tuning on the swe_only dataset, suggesting enhanced performance for specific code-related applications.
Training Details
The model was trained with a learning rate of 5e-05 over 3 epochs, utilizing a total batch size of 512 across 32 GPUs. The optimizer used was adamw_torch with standard betas and epsilon, and a cosine learning rate scheduler with a 0.05 warmup ratio.
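The reported hyperparameters can be sketched as a configuration dictionary. Note that only the total batch size (512) and GPU count (32) are stated, so the per-device batch size and gradient-accumulation split below are assumptions, as are the exact AdamW beta/epsilon values ("standard" is taken to mean the common defaults).

```python
# Hypothetical reconstruction of the reported training setup.
NUM_GPUS = 32
PER_DEVICE_BATCH_SIZE = 16       # assumed: 512 total / 32 GPUs
GRADIENT_ACCUMULATION_STEPS = 1  # assumed: no accumulation

training_config = {
    "learning_rate": 5e-5,
    "num_train_epochs": 3,
    "per_device_train_batch_size": PER_DEVICE_BATCH_SIZE,
    "gradient_accumulation_steps": GRADIENT_ACCUMULATION_STEPS,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,      # "standard betas" assumed to be the defaults
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,   # "standard epsilon" assumed
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.05,
}

# Effective global batch size implied by the settings above.
total_batch_size = NUM_GPUS * PER_DEVICE_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS
```

Any per-device batch size and accumulation pair whose product with the GPU count equals 512 would match the reported total equally well.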
Intended Use Cases
Given its fine-tuning on a code-specific dataset, this model is likely best suited for:
- Code generation.
- Code completion.
- Code understanding and analysis within the domain covered by the swe_only dataset.
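A minimal inference sketch with the Hugging Face transformers library is shown below. The system prompt and the sample user request are illustrative choices, not part of the model card; the chat-template call follows the usual pattern for Qwen-style instruct models.

```python
MODEL_ID = "yueqis/swe_only-qwen-coder-7b-3epochs-30k-5e-5"

def build_chat(user_prompt: str) -> list[dict]:
    """Wrap a user request in the chat-message format used by instruct models.
    The system prompt here is an illustrative assumption."""
    return [
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": user_prompt},
    ]

if __name__ == "__main__":
    # Heavy imports and the model download happen only when run as a script.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )

    messages = build_chat("Write a Python function that reverses a linked list.")
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)

    output = model.generate(inputs, max_new_tokens=256)
    # Decode only the newly generated tokens, not the prompt.
    print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```

With the 32768-token context window, substantially longer prompts (e.g. whole source files) can be passed than the short example above.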