yueqis/swe_only-qwen-coder-7b-3epochs-30k-5e-5

Text Generation · Concurrency Cost: 1 · Model Size: 7.6B · Quant: FP8 · Context Length: 32k · Published: Oct 10, 2025 · License: other · Architecture: Transformer

yueqis/swe_only-qwen-coder-7b-3epochs-30k-5e-5 is a 7.6-billion-parameter language model fine-tuned from Qwen/Qwen2.5-Coder-7B-Instruct on the swe_only dataset. The fine-tuning targets code generation and code understanding, making the model suitable for applications that need strong code-related capabilities.


Overview

This model, yueqis/swe_only-qwen-coder-7b-3epochs-30k-5e-5, is a specialized fine-tune of Qwen2.5-Coder-7B-Instruct with 7.6 billion parameters. The additional training on the swe_only dataset indicates a focus on particular software engineering tasks and code domains.

Key Characteristics

  • Base Model: Fine-tuned from Qwen/Qwen2.5-Coder-7B-Instruct.
  • Parameter Count: 7.6 billion parameters.
  • Context Length: Supports a context length of 32768 tokens.
  • Training Focus: Specialized fine-tuning on the swe_only dataset, suggesting enhanced performance for specific code-related applications.

Training Details

The model was trained with a learning rate of 5e-05 over 3 epochs, utilizing a total batch size of 512 across 32 GPUs. The optimizer used was adamw_torch with standard betas and epsilon, and a cosine learning rate scheduler with a 0.05 warmup ratio.
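For reference, these hyperparameters map onto a Hugging Face TrainingArguments configuration roughly as sketched below. This is a hypothetical reconstruction, not the author's actual training script: the output directory and per-device batch size are assumptions, chosen so that 32 GPUs yield the reported total batch size of 512.

```python
# Hypothetical reconstruction of the reported hyperparameters using
# Hugging Face TrainingArguments; output_dir and the per-device batch
# size are assumptions (16 per device x 32 GPUs = 512 total).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="swe_only-qwen-coder-7b",  # assumed name
    num_train_epochs=3,
    learning_rate=5e-5,
    per_device_train_batch_size=16,       # 16 x 32 GPUs = 512 total
    optim="adamw_torch",                  # default betas and epsilon
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
)
```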

Intended Use Cases

Given its fine-tuning on a code-specific dataset, this model is likely best suited for:

  • Code generation.
  • Code completion.
  • Code understanding and analysis within the domain covered by the swe_only dataset.
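The snippet below is a minimal inference sketch using the Hugging Face transformers library, following the standard Qwen2.5-Coder-Instruct chat-template flow; the prompt is an illustrative placeholder.

```python
# Minimal inference sketch with Hugging Face transformers; the chat
# template call follows the usual Qwen2.5-Coder-Instruct convention.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "yueqis/swe_only-qwen-coder-7b-3epochs-30k-5e-5"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user",
             "content": "Write a Python function that reverses a linked list."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```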