AgPerry/SWE-Lego-Qwen3-4B-posttrain-v2
AgPerry/SWE-Lego-Qwen3-4B-posttrain-v2 is a 4 billion parameter language model, fine-tuned from Qwen/Qwen3-4B. This model is specifically optimized for software engineering tasks, leveraging real and synthetic resolved trajectories from the SWE-Lego dataset. Its primary differentiation lies in its specialized training for code-related problem-solving and trajectory generation, making it suitable for automated software development workflows.
Loading preview...
Overview
This model, SWE-Lego-Qwen3-4B-posttrain-v2, is a specialized variant of the 4 billion parameter Qwen3-4B architecture. It has undergone fine-tuning on the SWE-Lego dataset, which comprises both real and synthetic resolved trajectories. The training specifically utilized a turn_mask to enhance its performance in sequential decision-making relevant to software engineering tasks.
Key Characteristics
- Base Model: Qwen/Qwen3-4B, a 4 billion parameter causal language model.
- Specialized Fine-tuning: Trained on the SWE-Lego dataset, focusing on resolved trajectories for software development.
- Optimized for Trajectories: Utilizes
turn_maskduring training, suggesting an emphasis on understanding and generating sequences of actions or steps.
Training Details
The model was trained with a learning rate of 0.0001, a total batch size of 64 (achieved with 8 devices and 8 gradient accumulation steps), and a cosine learning rate scheduler with 0.1 warmup steps over 4 epochs. The optimizer used was AdamW with standard betas and epsilon. This configuration aims to provide robust learning from the specialized SWE-Lego dataset.
Intended Use Cases
While specific intended uses are not detailed in the original model card, the fine-tuning on software engineering trajectories implies suitability for tasks such as:
- Automated code generation based on problem descriptions.
- Assisting with debugging by suggesting resolution steps.
- Generating sequences of actions for software development environments.
- Understanding and predicting developer workflows.