laion/SweSmith-8B-SFT-NoRope-step58
The laion/SweSmith-8B-SFT-NoRope-step58 model is an 8 billion parameter Qwen3-based language model, fine-tuned using Reinforcement Learning with Leave-One-Out baselines (RLOO-N) on 2,500 oracle-verified SWEsmith tasks. It features a 32,768 token context length and is specifically optimized for software engineering tasks, demonstrating improved performance over its SFT base model on SWE-bench 100 and dev_set_71 benchmarks. This model is designed for automated code generation and bug fixing within a software development context.
Loading preview...
Model Overview
laion/SweSmith-8B-SFT-NoRope-step58 is an 8 billion parameter language model built upon the Qwen3 architecture. It has been specifically fine-tuned using an advanced Reinforcement Learning (RL) method called RLOO-N (Reinforcement Learning with Leave-One-Out baselines) to excel in software engineering tasks. The model maintains a substantial context window of 32,768 tokens and does not utilize rope scaling.
Key Capabilities
- Software Engineering Task Performance: Demonstrates enhanced capabilities in automated software development tasks, including code generation and bug fixing.
- Benchmark Improvement: Outperforms its Supervised Fine-Tuning (SFT) base model on critical software engineering benchmarks:
- Achieves a pass@1 score of 0.227 on
dev_set_71, surpassing the base model's 0.213. - Attains a score of 0.220 on
SWE-bench 100, an improvement over the base model's 0.210.
- Achieves a pass@1 score of 0.227 on
- RL-Trained: Benefits from Reinforcement Learning on a dataset of 2,500 oracle-verified SWEsmith tasks, ensuring robust performance in practical scenarios.
Good For
- Automated Code Generation: Ideal for generating code snippets or entire functions based on given specifications.
- Bug Fixing and Code Refinement: Suitable for identifying and suggesting fixes for software bugs.
- Software Development Workflows: Can be integrated into tools and systems requiring automated assistance for software engineering challenges.