AgPerry/SWE-Lego-Qwen3-4B-posttrain-v2

TEXT GENERATIONConcurrency Cost:1Model Size:4BQuant:BF16Ctx Length:32kPublished:Apr 17, 2026License:otherArchitecture:Transformer Cold

AgPerry/SWE-Lego-Qwen3-4B-posttrain-v2 is a 4 billion parameter language model, fine-tuned from Qwen/Qwen3-4B. This model is specifically optimized for software engineering tasks, leveraging real and synthetic resolved trajectories from the SWE-Lego dataset. Its primary differentiation lies in its specialized training for code-related problem-solving and trajectory generation, making it suitable for automated software development workflows.

Loading preview...

Overview

This model, SWE-Lego-Qwen3-4B-posttrain-v2, is a specialized variant of the 4 billion parameter Qwen3-4B architecture. It has undergone fine-tuning on the SWE-Lego dataset, which comprises both real and synthetic resolved trajectories. The training specifically utilized a turn_mask to enhance its performance in sequential decision-making relevant to software engineering tasks.

Key Characteristics

  • Base Model: Qwen/Qwen3-4B, a 4 billion parameter causal language model.
  • Specialized Fine-tuning: Trained on the SWE-Lego dataset, focusing on resolved trajectories for software development.
  • Optimized for Trajectories: Utilizes turn_mask during training, suggesting an emphasis on understanding and generating sequences of actions or steps.

Training Details

The model was trained with a learning rate of 0.0001, a total batch size of 64 (achieved with 8 devices and 8 gradient accumulation steps), and a cosine learning rate scheduler with 0.1 warmup steps over 4 epochs. The optimizer used was AdamW with standard betas and epsilon. This configuration aims to provide robust learning from the specialized SWE-Lego dataset.

Intended Use Cases

While specific intended uses are not detailed in the original model card, the fine-tuning on software engineering trajectories implies suitability for tasks such as:

  • Automated code generation based on problem descriptions.
  • Assisting with debugging by suggesting resolution steps.
  • Generating sequences of actions for software development environments.
  • Understanding and predicting developer workflows.