Overview

SWE-Lego-Qwen3-4B-posttrain-v2 is a fine-tuned variant of Qwen3-4B, a 4-billion-parameter causal language model. It was fine-tuned on the SWE-Lego dataset, which comprises both real and synthetic resolved trajectories. Training applied a turn_mask, presumably to control which turns of each trajectory contribute to learning, with the aim of improving multi-step decision-making on software engineering tasks.

Key Characteristics

Base Model: Qwen/Qwen3-4B, a 4-billion-parameter causal language model.
Specialized Fine-tuning: Trained on the SWE-Lego dataset, focusing on resolved trajectories for software development.
Optimized for Trajectories: Utilizes turn_mask during training, suggesting an emphasis on understanding and generating sequences of actions or steps.
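
The model card does not define turn_mask, so the following is only a sketch of one common interpretation: a per-turn 0/1 flag that decides whether a turn's tokens count toward the training loss. The function name build_labels, the token ids, and the mask values are hypothetical and chosen for illustration; the only borrowed convention is the label value -100, which PyTorch's cross-entropy loss ignores by default.

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch cross-entropy by default


def build_labels(turns, turn_mask):
    """Flatten per-turn token ids into one label list.

    Assumption (not from the model card): turn_mask[i] == 1 means turn i
    is trained on, and turn_mask[i] == 0 means its tokens are hidden
    from the loss.
    """
    if len(turns) != len(turn_mask):
        raise ValueError("turns and turn_mask must have the same length")
    labels = []
    for token_ids, keep in zip(turns, turn_mask):
        if keep:
            labels.extend(token_ids)
        else:
            labels.extend([IGNORE_INDEX] * len(token_ids))
    return labels


# Hypothetical four-turn trajectory: only turns 2 and 4 are trained on.
turns = [[101, 102], [201, 202, 203], [301], [401, 402]]
turn_mask = [0, 1, 0, 1]
labels = build_labels(turns, turn_mask)
```

Under this reading, masked turns still appear in the model's input as context; they just produce no gradient.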

Training Details

The model was trained for 4 epochs with a learning rate of 0.0001 and a total batch size of 64 (8 devices with 8 gradient accumulation steps each, which implies a per-device batch size of 1). The learning rate followed a cosine schedule with a warmup ratio of 0.1. The optimizer was AdamW with standard betas and epsilon.
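
The batch-size arithmetic and the shape of the schedule can be checked with a short sketch. It assumes that the "0.1 warmup" figure is a warmup ratio (a fraction of total steps), that warmup is linear, and that the cosine decays to zero. The per-device batch size of 1 is inferred from 64 / (8 × 8), and total_steps = 1000 is a hypothetical value, since the model card does not report the number of optimizer steps.

```python
import math


def effective_batch_size(per_device, devices, grad_accum_steps):
    """Number of samples contributing to one optimizer step."""
    return per_device * devices * grad_accum_steps


def lr_at(step, total_steps, base_lr=1e-4, warmup_ratio=0.1):
    """Linear warmup followed by cosine decay to zero (assumed shape)."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Ramp linearly from 0 up to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase completed, from 0.0 to 1.0.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


batch = effective_batch_size(per_device=1, devices=8, grad_accum_steps=8)
total_steps = 1000  # hypothetical; not reported in the model card
peak = lr_at(100, total_steps)      # end of warmup
midpoint = lr_at(550, total_steps)  # halfway through the decay
final = lr_at(total_steps, total_steps)
```

With these assumptions the learning rate peaks at 0.0001 after the first 10% of steps, falls to half of that midway through the decay, and reaches zero at the final step.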

Intended Use Cases

While specific intended uses are not detailed in the original model card, the fine-tuning on software engineering trajectories implies suitability for tasks such as:

Automated code generation based on problem descriptions.
Assisting with debugging by suggesting resolution steps.
Generating sequences of actions for software development environments.
Understanding and predicting developer workflows.
