yueqis/web-qwen-coder-14b-3epochs-25k-5e-5

  • Task: Text Generation
  • Model Size: 14.8B
  • Quantization: FP8
  • Context Length: 32k
  • Published: Oct 24, 2025
  • License: other
  • Architecture: Transformer
  • Concurrency Cost: 1

The yueqis/web-qwen-coder-14b-3epochs-25k-5e-5 model is a 14.8 billion parameter language model, fine-tuned from a Qwen-Coder 14B base model on a web dataset. Built on the Qwen-Coder architecture, it is optimized for code generation and understanding, making it well suited to developers working on programming tasks.


Model Overview

This model, yueqis/web-qwen-coder-14b-3epochs-25k-5e-5, is a 14.8 billion parameter language model: a fine-tuned iteration of a Qwen-Coder 14B base model, trained on an additional web dataset. The underlying architecture belongs to the Qwen-Coder family, reflecting its primary focus on code-related applications.
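
A minimal loading sketch with Hugging Face transformers is shown below. The repository ID comes from the model name above; the dtype/device settings, prompt, and generation parameters are illustrative assumptions, not values published with the model.

```python
# Minimal sketch: loading the model with Hugging Face transformers.
# The repo ID matches the model name; dtype and device settings are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "yueqis/web-qwen-coder-14b-3epochs-25k-5e-5"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # let transformers pick the checkpoint dtype
    device_map="auto",    # spread the 14.8B parameters across available GPUs
)

# Illustrative prompt; any code-completion style input works similarly.
prompt = "# Python function that checks whether a string is a palindrome\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```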

Key Training Details

The model underwent training with the following hyperparameters (a configuration sketch follows the list):

  • Learning Rate: 5e-05
  • Batch Size: 1 (train), 8 (eval)
  • Gradient Accumulation: 16 steps, giving a total effective batch size of 128 (1 per device × 16 steps, implying 8 training devices)
  • Optimizer: AdamW (beta and epsilon values not specified)
  • Scheduler: Cosine learning rate scheduler with a 0.05 warmup ratio
  • Epochs: 1.0
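
These values map onto a Hugging Face TrainingArguments configuration roughly as sketched below; the output directory, the optimizer choice string, and the implied 8-device setup are assumptions, since the card does not state them.

```python
# Sketch of the reported hyperparameters as Hugging Face TrainingArguments.
# The effective batch size of 128 implies 1 (per device) x 16 (accumulation)
# x 8 (devices); the device count is an inference, not stated in the card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="web-qwen-coder-14b-finetune",  # hypothetical path
    learning_rate=5e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=16,
    num_train_epochs=1.0,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    optim="adamw_torch",  # AdamW; beta/epsilon values unspecified in the card
)
```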

Intended Use Cases

Given its fine-tuning on a web dataset and its Qwen-Coder lineage, this model is likely best suited for the following (see the sketch after the list):

  • Code generation tasks
  • Code completion and suggestion
  • Understanding and analyzing code snippets
  • Applications requiring code-centric language processing
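
As a concrete illustration of the first two use cases, a completion-style prompt can be passed straight to a text-generation pipeline; the prompt and decoding settings below are assumptions for demonstration purposes.

```python
# Illustrative code-completion call using the text-generation pipeline.
# The prompt and sampling settings are assumptions for demonstration only.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="yueqis/web-qwen-coder-14b-3epochs-25k-5e-5",
    device_map="auto",
)

# Completion-style prompt: the model continues the function body.
prompt = (
    "def fibonacci(n: int) -> int:\n"
    '    """Return the n-th Fibonacci number."""\n'
)
completion = generator(prompt, max_new_tokens=96, do_sample=False)
print(completion[0]["generated_text"])
```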