princeton-nlp/SWE-Llama-13b

Text Generation · Concurrency Cost: 1 · Model Size: 13B · Quant: FP8 · Ctx Length: 4k · Published: Oct 10, 2023 · Architecture: Transformer

SWE-Llama-13b is a 13 billion parameter Transformer model developed by princeton-nlp, fine-tuned from CodeLlama. It specializes in software engineering tasks, specifically generating code patches to resolve GitHub issues based on issue descriptions and code context. This model is optimized for automated bug fixing and software development workflows, leveraging real-world GitHub data for its training.


SWE-Llama-13b: Fine-tuned for Software Engineering Tasks

SWE-Llama-13b is a 13 billion parameter model from princeton-nlp, built upon the CodeLlama architecture. It is specifically fine-tuned for software engineering tasks, with a primary objective of generating code patches to resolve real-world GitHub issues. The model's training data consists of 19,000 issues and pull requests collected from 37 popular Python code repositories on GitHub, distinct from the SWE-bench evaluation set.
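The intended workflow is: take an issue description plus retrieved code context, assemble them into a prompt, and ask the model for a patch. The sketch below shows one plausible way to build such a prompt; the template, tag names, and `build_patch_prompt` helper are illustrative assumptions, not the exact format the model was fine-tuned on (consult the SWE-bench repository for that).

```python
# Hypothetical sketch of prompt assembly for SWE-Llama-13b.
# The tag-based template below is an assumption, not the official format.

def build_patch_prompt(issue_text: str, code_context: str) -> str:
    """Combine a GitHub issue description with retrieved code context
    into a single prompt asking the model for a patch."""
    return (
        "You will be given a GitHub issue and relevant code context.\n"
        "Generate a patch that resolves the issue.\n\n"
        f"<issue>\n{issue_text}\n</issue>\n\n"
        f"<code>\n{code_context}\n</code>\n\n"
        "<patch>"
    )

prompt = build_patch_prompt(
    issue_text="TypeError when calling frobnicate() with no arguments",
    code_context="def frobnicate(x):\n    return x * 2\n",
)

# Generation itself is omitted here, since loading the 13B checkpoint
# requires substantial GPU memory; it would look roughly like:
# from transformers import AutoModelForCausalLM, AutoTokenizer
# tok = AutoTokenizer.from_pretrained("princeton-nlp/SWE-Llama-13b")
# model = AutoModelForCausalLM.from_pretrained("princeton-nlp/SWE-Llama-13b")
# out = model.generate(**tok(prompt, return_tensors="pt"), max_new_tokens=512)
```

The model's completion after `<patch>` is then parsed as the proposed code change.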

Key Capabilities

  • Automated Issue Resolution: Designed to generate code fixes for GitHub issues, conditioned on the issue description and relevant code context.
  • Code Patch Generation: Focuses on producing executable code changes to address identified software bugs or feature requests.
  • Specialized Training: Fine-tuned using the LoRA method over 4 epochs on a dataset of real-world software engineering problems.
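LoRA keeps fine-tuning cheap by freezing the base weights and training only two low-rank factors per adapted matrix. A back-of-envelope calculation shows why; the rank of 16 and the "attention projections only" choice below are illustrative assumptions, not the exact SWE-Llama recipe.

```python
# Back-of-envelope: LoRA replaces a full d×k weight update with two
# low-rank factors of shapes d×r and r×k, so trainable parameters per
# adapted matrix drop from d*k to r*(d + k).

d_model = 5120          # hidden size typical of a 13B Llama-family model
n_layers = 40           # transformer layers in such a model
rank = 16               # hypothetical LoRA rank (assumption)

full_per_matrix = d_model * d_model            # full update of one projection
lora_per_matrix = rank * (d_model + d_model)   # LoRA factors for the same matrix

# Adapting the four attention projections (q, k, v, o) in every layer:
full_params = 4 * n_layers * full_per_matrix
lora_params = 4 * n_layers * lora_per_matrix

print(f"full: {full_params:,}  lora: {lora_params:,}  "
      f"ratio: {full_params // lora_params}x")
```

Under these assumptions, LoRA trains roughly 26M parameters instead of the 4B-plus a full fine-tune of the same matrices would touch.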

Performance

On the SWE-bench benchmark, SWE-Llama-13b achieved a 4.0% issue resolution rate under "oracle" retrieval, where the files edited by the reference solution are provided as context, demonstrating its capability in automated software repair.
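The resolution rate is computed by applying each generated patch to the repository and running the issue's tests: an instance counts as resolved only if the patch applies and the tests pass. Below is a deliberately simplified sketch of that scoring loop; real SWE-bench applies unified git diffs and runs the project's test suite, whereas here patches are reduced to `(old, new)` string replacements and tests to callables, all hypothetical.

```python
# Minimal sketch of SWE-bench-style scoring (simplified assumption:
# patches are (old, new) replacements, tests are callables on the source).

def apply_patch(source, patch):
    """Apply a single (old, new) replacement; return None if it doesn't apply."""
    old, new = patch
    if old not in source:
        return None
    return source.replace(old, new, 1)

def resolution_rate(instances):
    """Fraction of instances whose patch applies and whose test then passes."""
    resolved = 0
    for source, patch, test in instances:
        patched = apply_patch(source, patch)
        if patched is not None and test(patched):
            resolved += 1
    return resolved / len(instances)

instances = [
    ("def f(x):\n    return x * 2\n",
     ("return x * 2", "return x + 2"),
     lambda src: "x + 2" in src),          # patch applies, test passes
    ("def g():\n    pass\n",
     ("return None", "return 0"),
     lambda src: True),                     # patch does not apply
]
rate = resolution_rate(instances)
print(f"{rate:.0%} resolved")  # → 50% resolved
```

A patch that fails to apply scores zero regardless of its content, which is why well-formed diffs matter as much as correct logic.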

Good For

  • Developers and researchers working on automated bug fixing.
  • Integrating AI into software development pipelines for issue resolution.
  • Tasks requiring code generation in response to natural language problem descriptions within a software engineering context.