xuyeliu123/swe-agent-lm-7b-num07-swesmith

Text Generation · Concurrency Cost: 1 · Model Size: 7.6B · Quant: FP8 · Ctx Length: 32k · Published: Apr 25, 2026 · License: other · Architecture: Transformer

The xuyeliu123/swe-agent-lm-7b-num07-swesmith model is a 7.6 billion parameter instruction-tuned causal language model, fine-tuned from Qwen/Qwen2.5-Coder-7B-Instruct. It is optimized for code-related tasks, having been trained on the fim_midtrain dataset series, and is designed for applications that require robust code generation and understanding within a 32,768-token context window.

Model Overview

The xuyeliu123/swe-agent-lm-7b-num07-swesmith is a 7.6 billion parameter instruction-tuned language model built on Qwen/Qwen2.5-Coder-7B-Instruct and fine-tuned specifically to improve its performance on code-related tasks.
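
As a quick start, the model can be loaded with the Hugging Face transformers library like any Qwen2.5-based causal LM. The sketch below is illustrative, not an official recipe: the repository id is taken from this card, and chat-template support is assumed to be inherited from the Qwen2.5-Coder-7B-Instruct base model.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repository id from this model card; chat-template behavior is assumed
# to carry over from the Qwen2.5-Coder-7B-Instruct base model.
model_id = "xuyeliu123/swe-agent-lm-7b-num07-swesmith"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [
    {"role": "user", "content": "Write a Python function that reverses a linked list."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```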

Key Capabilities

  • Code-centric Fine-tuning: The model's training involved specialized datasets including fim_midtrain_v1, fim_midtrain_v2, fim_midtrain_v3_pairs, and fim_midtrain_v3_triples, indicating a focus on Fill-in-the-Middle (FIM) tasks and general code understanding (see the FIM prompt sketch after this list).
  • Instruction Following: As an instruction-tuned model, it is designed to follow user prompts effectively for various programming-related queries.
  • Context Length: Supports a substantial context window of 32768 tokens, beneficial for handling larger codebases or complex programming problems.
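
Because the fine-tuning data targets Fill-in-the-Middle, completion-style prompting may be useful in addition to chat. The sketch below assumes this fine-tune retains the base Qwen2.5-Coder FIM special tokens (<|fim_prefix|>, <|fim_suffix|>, <|fim_middle|>); that is an assumption inherited from the base model, not something stated on this card.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "xuyeliu123/swe-agent-lm-7b-num07-swesmith"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# Qwen2.5-Coder FIM format (assumed to carry over to this fine-tune):
# the model produces the code that belongs between prefix and suffix.
prefix = "def read_json(path):\n    "
suffix = "\n    return data\n"
prompt = f"<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
))
```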

Training Details

The model was trained with a learning rate of 1e-05 and a total batch size of 128, using a cosine learning rate scheduler with a 0.1 warmup ratio for a single epoch. Training ran on 8 GPUs with gradient accumulation over 16 steps, which implies a per-device batch size of 1 (8 × 16 × 1 = 128).
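
For reference, these hyperparameters map onto a Hugging Face TrainingArguments configuration roughly as follows. This is an illustrative reconstruction, not the author's actual training script; the per-device batch size of 1 is inferred from the batch arithmetic above, and the output directory is hypothetical.

```python
from transformers import TrainingArguments

# Illustrative reconstruction of the reported hyperparameters.
# Total batch size = 8 GPUs x per_device_train_batch_size (1)
#                    x gradient_accumulation_steps (16) = 128.
args = TrainingArguments(
    output_dir="swe-agent-lm-7b-num07-swesmith",  # hypothetical path
    learning_rate=1e-5,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=1.0,
)
```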

Good for

  • Code Generation: Generating code snippets or completing partial code.
  • Code Understanding: Assisting with code analysis or debugging tasks.
  • Developer Tools: Integration into IDEs or automated coding assistants.
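
For integration into developer tools, the model can be served with an OpenAI-compatible runtime; the sketch below uses vLLM's offline Python API as one option. The library choice and sampling parameters are assumptions for illustration, not prescribed by this card.

```python
from vllm import LLM, SamplingParams

# Offline inference with vLLM; max_model_len matches the 32k context
# listed on this card.
llm = LLM(model="xuyeliu123/swe-agent-lm-7b-num07-swesmith", max_model_len=32768)
params = SamplingParams(temperature=0.2, max_tokens=256)

outputs = llm.generate(["# Write a function that parses a CSV line\n"], params)
print(outputs[0].outputs[0].text)
```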