yueqis/web-qwen-coder-14b-3epochs-25k-5e-5
The yueqis/web-qwen-coder-14b-3epochs-25k-5e-5 model is a 14.8-billion-parameter language model, fine-tuned from a Qwen-Coder base model on a web dataset. Built on the Qwen-Coder architecture, it is optimized for code generation and understanding, making it well suited for developers working on programming tasks.
Model Overview
This model is a fine-tuned iteration of a Qwen-Coder base model, further trained on a web dataset. Its Qwen-Coder lineage indicates a primary focus on code-related applications.
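A minimal loading sketch, assuming the checkpoint is published in standard Hugging Face Transformers format; the dtype and device settings are illustrative choices, not part of the model card:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "yueqis/web-qwen-coder-14b-3epochs-25k-5e-5"

# Load the tokenizer and weights. bfloat16 roughly halves memory versus
# fp32, and device_map="auto" (requires the accelerate package) places
# layers across available GPUs automatically.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
```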
Key Training Details
The model was trained with the following hyperparameters (an illustrative configuration sketch follows the list):
- Learning Rate: 5e-05
- Batch Size: 1 (train), 8 (eval)
- Gradient Accumulation: 16 steps; with a per-device train batch size of 1, the reported total effective batch size of 128 implies training across 8 devices (1 × 16 × 8 = 128)
- Optimizer: AdamW (beta and epsilon values per the run's configuration)
- Scheduler: Cosine learning rate scheduler with a 0.05 warmup ratio
- Epochs: 1.0 (as reported in the training configuration, although the model name suggests a 3-epoch run)
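The hyperparameters above map onto a Hugging Face TrainingArguments configuration roughly as follows. This is a reconstruction under stated assumptions, not the actual training script; output_dir and the specific AdamW implementation (adamw_torch) are assumptions:

```python
from transformers import TrainingArguments

# Illustrative reconstruction of the reported setup. output_dir and the
# optim choice are assumptions, not taken from the model card.
args = TrainingArguments(
    output_dir="web-qwen-coder-14b-finetune",
    learning_rate=5e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=16,  # with 8 devices: 1 * 16 * 8 = 128 effective
    optim="adamw_torch",
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    num_train_epochs=1.0,
)
```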
Intended Use Cases
Given its fine-tuning on a web dataset and its Qwen-Coder lineage, this model is likely best suited for the following (a short generation example follows the list):
- Code generation tasks
- Code completion and suggestion
- Understanding and analyzing code snippets
- Applications requiring code-centric language processing
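As an example of the first use case, here is a short generation sketch, assuming the model and tokenizer were loaded as shown earlier; the prompt and decoding settings are illustrative:

```python
prompt = "# Write a Python function that checks whether a string is a palindrome\n"

# Tokenize the prompt and move it to the model's device.
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sample a completion; max_new_tokens and temperature are example values.
outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.2,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```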