Salesforce/SweRankLLM-Small

Text Generation · Concurrency cost: 1 · Model size: 7.6B · Quant: FP8 · Context length: 32k · Published: Jun 24, 2025 · License: cc-by-nc-4.0 · Architecture: Transformer · Open weights

SweRankLLM-Small is a 7.6-billion-parameter language model developed by Salesforce, based on Qwen2.5-Coder-7B-Instruct, with a 131,072-token context length. It is specifically fine-tuned for listwise code reranking, which significantly improves result quality for software issue localization. The model is designed to be paired with performant code retrievers such as SweRankEmbed to improve search and ranking in software development contexts.
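The retrieve-then-rerank pipeline described above can be sketched in plain Python. The toy bag-of-words retriever below is a stand-in for SweRankEmbed's dense embeddings, and the `reranker` callable is a placeholder for a call to SweRankLLM-Small; the function names and scoring are illustrative assumptions, not the actual SweRank API.

```python
from collections import Counter
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity over token counts; a toy stand-in for the
    # dense-embedding similarity a retriever like SweRankEmbed computes.
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(issue: str, corpus: dict[str, str], k: int = 10) -> list[str]:
    # Stage 1: score every file against the issue text, keep the top-k.
    q = Counter(issue.lower().split())
    scored = sorted(
        corpus,
        key=lambda f: cosine(q, Counter(corpus[f].lower().split())),
        reverse=True,
    )
    return scored[:k]


def localize(issue, corpus, reranker, k=10):
    # Stage 2: hand the retriever's candidate list to a listwise reranker
    # (SweRankLLM-Small in the real pipeline) for the final ordering.
    candidates = retrieve(issue, corpus, k)
    return reranker(issue, candidates)
```

The point of the two-stage design is that the cheap retriever narrows thousands of files down to a short candidate list, so the expensive LLM reranker only ever sees k items.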


SweRankLLM-Small Overview

SweRankLLM-Small is built upon the Qwen2.5-Coder-7B-Instruct architecture. Its primary distinction is its specialized fine-tuning for listwise code reranking, a critical step in software issue localization. The model's 131,072-token context length allows it to process extensive code snippets and related context in a single prompt.

Key Capabilities

  • Code Reranking: Optimized to re-rank lists of candidate code snippets, improving the relevance and quality of results for software issue localization.
  • Enhanced Issue Localization: When integrated with effective code retrievers (e.g., SweRankEmbed), it significantly boosts the accuracy and utility of identifying relevant code for software issues.
  • Specialized Training: Trained on large-scale issue localization data derived from public Python GitHub repositories, ensuring its proficiency in real-world software development scenarios.
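Listwise reranking means the model sees all candidates in one prompt and emits a full ordering, rather than scoring items one at a time. A minimal sketch of prompt construction and output parsing is below; the bracketed-identifier template (`[2] > [1] > [3]`) follows the common RankGPT-style convention and is an assumption here, as the exact format SweRankLLM-Small was trained on is defined by its paper, not this card.

```python
import re


def build_listwise_prompt(issue: str, candidates: list[str]) -> str:
    # Number each candidate with a bracketed identifier and ask the model
    # for a ranking. Template is illustrative, not SweRank's exact format.
    lines = [f"Issue: {issue}", ""]
    for i, snippet in enumerate(candidates, 1):
        lines.append(f"[{i}] {snippet}")
    lines.append("")
    lines.append("Rank the snippets above from most to least relevant, "
                 "e.g. [2] > [1] > [3].")
    return "\n".join(lines)


def parse_ranking(output: str, n: int) -> list[int]:
    # Extract the identifier order from the model's free-form output,
    # dropping duplicates and out-of-range ids, then appending any
    # candidates the model omitted so the result is always a permutation.
    ids = [int(m) for m in re.findall(r"\[(\d+)\]", output)]
    seen, order = set(), []
    for i in ids:
        if 1 <= i <= n and i not in seen:
            seen.add(i)
            order.append(i)
    order += [i for i in range(1, n + 1) if i not in seen]
    return order
```

Defensive parsing like this matters in practice: generative rerankers sometimes repeat or drop identifiers, and downstream code still needs a complete permutation of the candidate list.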

Use Cases

  • Software Issue Localization: Ideal for developers and researchers working on systems that need to pinpoint specific code sections related to reported software issues.
  • Code Search and Ranking: Can be employed in tools that require intelligent re-ordering of code search results to present the most pertinent information first.
  • Research in Code Intelligence: Useful for academic and industrial research focusing on improving code understanding, retrieval, and ranking mechanisms. More details are available in the associated blog post and research paper.