laion/swesmith-nl2bash-stack-bugsseq
The laion/swesmith-nl2bash-stack-bugsseq model is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B. It is specialized for natural-language-to-bash translation, code reasoning, and bug detection, drawing on datasets including GLM-4.6-nl2bash-verified and GLM-4.6-inferredbugs. The model is designed to improve performance in generating shell commands from natural language queries and in identifying code issues.
Model Overview
laion/swesmith-nl2bash-stack-bugsseq is an 8-billion-parameter language model fine-tuned from the Qwen/Qwen3-8B architecture. It was trained on a combination of specialized datasets to enhance its capabilities in several key areas.
Key Capabilities
- Natural Language to Bash Translation: Fine-tuned on `penfever/GLM-4.6-nl2bash-verified-32ep-32k-reasoning`, indicating a strong focus on converting natural language instructions into executable bash commands.
- Code Reasoning: Training on `penfever/GLM-4.6-swesmith-32ep-131k-nosumm-reasoning` and `penfever/GLM-4.6-stackexchange-overflow-sandboxes-32eps-65k-reasoning` suggests proficiency in understanding and processing code-related logic and discussions.
- Bug Detection and Inference: The inclusion of the `penfever/GLM-4.6-inferredbugs-32ep-65k-reasoning` dataset points to an ability to identify and reason about potential bugs or errors in code.
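The model card does not include a usage snippet, so here is a minimal sketch of how a model like this is typically loaded and queried for bash translation with the `transformers` library. The prompt wording and generation settings are illustrative assumptions, not part of the official card:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "laion/swesmith-nl2bash-stack-bugsseq"

def build_messages(query: str) -> list[dict]:
    # Wrap a natural-language request as a single chat turn;
    # the exact prompt format here is an assumption.
    return [{"role": "user", "content": query}]

if __name__ == "__main__":
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

    messages = build_messages(
        "List all .log files under /var modified in the last day."
    )
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=256)
    # Decode only the newly generated tokens (the bash command / reasoning).
    print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Since the base model is Qwen3-8B, `apply_chat_template` picks up the Qwen chat format from the tokenizer; no manual prompt templating should be needed.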
Training Details
The model was trained for 7 epochs with a learning rate of 4e-05, using the AdamW optimizer with specific beta and epsilon values, a total batch size of 16 across 8 GPUs, and a cosine learning-rate scheduler with a 0.1 warmup ratio. Training used Transformers 4.57.3 and PyTorch 2.9.0+cu128.
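To make the scheduler concrete: a cosine schedule with a 0.1 warmup ratio ramps the learning rate linearly from 0 to the peak (4e-05) over the first 10% of steps, then decays it along a cosine curve to 0. The sketch below mirrors the semantics of `transformers`' `get_cosine_schedule_with_warmup`; the total step count is a placeholder:

```python
import math

def lr_at_step(step: int, total_steps: int,
               base_lr: float = 4e-5, warmup_ratio: float = 0.1) -> float:
    """Linear warmup followed by cosine decay to zero."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Cosine decay from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

total = 1000  # placeholder; actual step count depends on dataset size
print(lr_at_step(0, total))     # 0.0 at the start of warmup
print(lr_at_step(100, total))   # peak 4e-05 at the end of warmup
print(lr_at_step(1000, total))  # decays back to 0.0 by the final step
```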