xuyeliu123/swe-agent-lm-7b-swesmith
xuyeliu123/swe-agent-lm-7b-swesmith is an instruction-tuned, 7.61-billion-parameter model based on Qwen2.5-Coder, the code-specialized model series developed by Alibaba Cloud, and is optimized for code generation, code reasoning, and code fixing. This causal language model features a transformer architecture and supports a full context length of 131,072 tokens, making it well suited to complex coding tasks and real-world applications such as Code Agents.
Overview
xuyeliu123/swe-agent-lm-7b-swesmith is an instruction-tuned, 7.61-billion-parameter model from the Qwen2.5-Coder series, developed by Alibaba Cloud. The series is a code-focused specialization of the Qwen2.5 large language models, building on the CodeQwen1.5 foundation with significant improvements in coding capability.
Key Capabilities
- Enhanced Code Performance: Demonstrates substantial improvements in code generation, code reasoning, and code fixing. The larger 32B variant in this series is noted for matching GPT-4o's coding abilities.
- Extensive Training: Trained on 5.5 trillion tokens, including a rich mix of source code, text-code grounding, and synthetic data.
- Long-Context Support: Features a full context length of 131,072 tokens, using techniques such as YaRN for efficient handling of long inputs; note that the default `config.json` is set to 32,768 tokens.
- General Competencies: While specialized for code, it maintains strong performance in mathematics and general language understanding, making it suitable for comprehensive applications like Code Agents.
- Architecture: Built on a transformer architecture, incorporating RoPE, SwiGLU, RMSNorm, and Attention QKV bias.
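To go beyond the 32,768-token default and use the full 131,072-token window, the upstream Qwen2.5-Coder documentation suggests enabling YaRN rope scaling in `config.json`. A sketch of the relevant fragment (the factor of 4.0 follows the upstream Qwen2.5 card; it applies a static scaling regardless of actual input length):

```json
{
  "rope_scaling": {
    "factor": 4.0,
    "original_max_position_embeddings": 32768,
    "type": "yarn"
  }
}
```

Static YaRN scaling can slightly affect quality on short inputs, so it is typically enabled only when long-context processing is actually needed.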
Good For
- Code Generation and Refinement: Ideal for developers needing robust code generation, debugging, and reasoning functionalities.
- Code Agent Development: Provides a strong foundation for building sophisticated code agents due to its enhanced coding and general competencies.
- Long Codebase Analysis: Its extensive context window makes it suitable for processing and understanding large codebases or complex programming problems.
- Research and Development: Useful for researchers exploring advanced code-specific LLM applications and performance benchmarks.
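For the use cases above, prompts are supplied in the ChatML format that Qwen2.5-Coder instruct models expect. A minimal sketch that assembles such a prompt by hand (in practice, `tokenizer.apply_chat_template` from `transformers` produces this for you; the system and user strings here are illustrative):

```python
# Sketch of the ChatML prompt format used by Qwen2.5-Coder instruct
# models. Each turn is wrapped in <|im_start|>role ... <|im_end|>, and
# the prompt ends with the assistant header so the model continues
# with its reply.

def build_chatml_prompt(system: str, user: str) -> str:
    """Assemble a single-turn ChatML prompt string."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = build_chatml_prompt(
    "You are a helpful coding assistant.",
    "Write a Python function that reverses a string.",
)
print(prompt)
```

When using `transformers` directly, passing a list of `{"role": ..., "content": ...}` messages to `tokenizer.apply_chat_template(..., add_generation_prompt=True)` yields an equivalent prompt.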