xuyeliu123/swe-agent-lm-7b-num07-swesmith
The xuyeliu123/swe-agent-lm-7b-num07-swesmith model is a 7.6 billion parameter instruction-tuned causal language model, fine-tuned from Qwen/Qwen2.5-Coder-7B-Instruct. It is optimized for code-related tasks, specifically trained on fim_midtrain datasets. This model is designed for applications requiring robust code generation and understanding capabilities within a 32768 token context length.
Model Overview
The xuyeliu123/swe-agent-lm-7b-num07-swesmith is a 7.6 billion parameter instruction-tuned language model, building upon the foundation of Qwen/Qwen2.5-Coder-7B-Instruct. This model has been specifically fine-tuned to enhance its performance on code-related tasks.
Key Capabilities
- Code-centric Fine-tuning: The model's training involved specialized datasets, including `fim_midtrain_v1`, `fim_midtrain_v2`, `fim_midtrain_v3_pairs`, and `fim_midtrain_v3_triples`, indicating a focus on Fill-in-the-Middle (FIM) tasks and general code understanding.
- Instruction Following: As an instruction-tuned model, it is designed to follow user prompts effectively for various programming-related queries.
- Context Length: Supports a substantial context window of 32768 tokens, beneficial for handling larger codebases or complex programming problems.
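Given the FIM-focused training data, the model is likely intended to complete code between a given prefix and suffix. As a minimal sketch, assuming the fine-tune inherits the Qwen2.5-Coder FIM special tokens (`<|fim_prefix|>`, `<|fim_suffix|>`, `<|fim_middle|>`) from its base model, a FIM prompt could be assembled like this (verify the tokens against the model's tokenizer config before relying on them):

```python
def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Assemble a Fill-in-the-Middle prompt string.

    Assumes the Qwen2.5-Coder FIM token convention carried over from the
    base model; the model is then expected to generate the missing middle.
    """
    return f"<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>"

# Ask the model to fill in the body of a function.
prompt = build_fim_prompt(
    prefix="def add(a, b):\n    ",
    suffix="\n    return result\n",
)
print(prompt)
```

The resulting string is what would be tokenized and passed to the model for completion-style (non-chat) FIM inference.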
Training Details
The model was trained with a learning rate of 1e-05, a total batch size of 128, and a cosine learning rate scheduler with a warmup ratio of 0.1 for 1 epoch. Training ran on 8 GPUs with 16 gradient accumulation steps.
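With 8 GPUs and 16 gradient accumulation steps, the stated total batch size of 128 implies a per-device batch size of 1 (8 × 16 × 1 = 128). The arithmetic and the reported hyperparameters can be sketched as a config (field names here are illustrative, not taken from the actual training script):

```python
num_gpus = 8
grad_accum_steps = 16
total_batch_size = 128

# Per-device batch size implied by the reported totals:
# total = num_gpus * grad_accum_steps * per_device
per_device_batch_size = total_batch_size // (num_gpus * grad_accum_steps)

training_config = {
    "learning_rate": 1e-05,
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.1,
    "num_train_epochs": 1.0,
    "per_device_train_batch_size": per_device_batch_size,
    "gradient_accumulation_steps": grad_accum_steps,
}

print(per_device_batch_size)  # 1
```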
Good for
- Code Generation: Generating code snippets or completing partial code.
- Code Understanding: Assisting with code analysis or debugging tasks.
- Developer Tools: Integration into IDEs or automated coding assistants.