laion/nl2bash-swesmith-stack-bugsseq

Text generation · Model size: 8B · Quantization: FP8 · Context length: 32k · Published: Dec 12, 2025 · License: apache-2.0 · Architecture: Transformer · Open weights

The laion/nl2bash-swesmith-stack-bugsseq model is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B. Developed by laion, it specializes in natural-language-to-bash translation and reasoning tasks. It was trained on a combination of specialized datasets, including GLM-4.6-nl2bash-verified, GLM-4.6-swesmith, GLM-4.6-stackexchange-overflow-sandboxes, and GLM-4.6-inferredbugs, making it suitable for complex command generation and bug-inference scenarios. Its 32,768-token context length supports processing extensive input for these applications.


Model Overview

This model, laion/nl2bash-swesmith-stack-bugsseq, is an 8-billion-parameter language model fine-tuned by laion from Qwen/Qwen3-8B to excel at natural-language-to-bash command translation and reasoning.

Key Capabilities

  • Natural Language to Bash Translation: Optimized for converting natural language instructions into executable bash commands.
  • Reasoning Tasks: Enhanced for reasoning, particularly within the context of code and command generation.
  • Specialized Dataset Training: Fine-tuned on a unique combination of datasets:
    • penfever/GLM-4.6-nl2bash-verified-32ep-32k-reasoning
    • penfever/GLM-4.6-swesmith-32ep-131k-nosumm-reasoning
    • penfever/GLM-4.6-stackexchange-overflow-sandboxes-32eps-65k-reasoning
    • penfever/GLM-4.6-inferredbugs-32ep-65k-reasoning
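The translation capability above is typically driven through a chat-style prompt. A minimal sketch of assembling one, assuming the common Qwen3 message schema of role/content dicts (in practice `tokenizer.apply_chat_template` renders these into model input; the system-prompt wording here is a hypothetical example, not the prompt used in training):

```python
def build_nl2bash_messages(instruction: str) -> list[dict]:
    """Wrap a natural-language request in chat messages for the model.

    The system-prompt wording is a hypothetical example; the message
    schema (role/content dicts) follows the common Qwen3 convention.
    """
    return [
        {"role": "system",
         "content": "Translate the user's request into a single bash command."},
        {"role": "user", "content": instruction},
    ]

messages = build_nl2bash_messages("list all .log files modified in the last day")
for m in messages:
    print(f"{m['role']}: {m['content']}")
```

The rendered messages would then be passed to the tokenizer's chat template and on to generation.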

Training Details

The model was trained with a learning rate of 4e-05, using a cosine learning-rate scheduler with a 0.1 warmup ratio over 7 epochs. It used a total batch size of 16 across 8 GPUs with the adamw_torch_fused optimizer. The training environment included Transformers 4.56.1 and PyTorch 2.9.1+cu128.
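The hyperparameters above can be gathered into one configuration mapping. A minimal sketch (the field names loosely mirror Hugging Face `TrainingArguments`, but this is a plain dict, not a call into the library):

```python
# Hyperparameters as stated in the model card; names loosely mirror
# Hugging Face TrainingArguments, but this is just a plain mapping.
training_config = {
    "learning_rate": 4e-05,
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.1,
    "num_train_epochs": 7,
    "total_batch_size": 16,
    "num_gpus": 8,
    "optim": "adamw_torch_fused",
}

# With 16 samples split evenly across 8 GPUs, each device sees 2 per step
# (assuming no gradient accumulation, which the card does not specify).
per_device_batch_size = (training_config["total_batch_size"]
                         // training_config["num_gpus"])
print(per_device_batch_size)  # → 2
```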

Intended Use Cases

This model is particularly well suited to applications that generate bash commands from natural-language queries, as well as to code-related reasoning and bug inference, where its specialized training data is most relevant.