Qwen3-4B ALFWorld+DBBench Mixed LoRA Adapter
This repository provides a LoRA adapter (r=64) fine-tuned with Unsloth from the Qwen/Qwen3-4B-Instruct-2507 base model. Note that it contains only the LoRA adapter weights; the base model must be downloaded and loaded separately.
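Because only the adapter weights are shipped, inference typically loads the base model first and then attaches the adapter, e.g. via the PEFT library. A minimal sketch, assuming `transformers` and `peft` are installed; the adapter identifier below is a placeholder, not this repository's actual id:

```python
BASE_MODEL = "Qwen/Qwen3-4B-Instruct-2507"
ADAPTER_ID = "path/or/hub-id-of-this-adapter"  # placeholder: replace with this repo's id

def load_adapter(base_id: str = BASE_MODEL, adapter_id: str = ADAPTER_ID):
    """Load the full-precision base model, then attach the LoRA adapter on top."""
    # Imports are deferred so the sketch can be read without the libraries present.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    model = AutoModelForCausalLM.from_pretrained(
        base_id, torch_dtype="auto", device_map="auto"
    )
    # Wraps the base model; only the small adapter weights are added.
    model = PeftModel.from_pretrained(model, adapter_id)
    return model, tokenizer
```

`PeftModel.merge_and_unload()` can optionally fold the adapter into the base weights for faster inference.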
Key Capabilities & Training Objective
The primary objective of this adapter is to improve multi-turn agent task performance. It was trained on a mix of ALFWorld (household tasks) and DBBench (database operations) datasets. During training, loss is applied to all assistant turns within each multi-turn trajectory, which enables the model to learn:
- Environment observation
- Action selection
- Tool use
- Error recovery
This makes the adapter particularly adept at handling complex, sequential agentic workflows.
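The all-assistant-turns objective can be sketched as a label mask: every assistant token contributes to the loss, while user, system, and tool tokens are masked with the standard ignore index. This is an illustrative sketch, not the repository's actual preprocessing code:

```python
# Sketch of multi-turn loss masking: loss is applied to every assistant
# turn; all other roles are masked out with -100 (the ignore index used
# by cross-entropy in most training frameworks).
IGNORE_INDEX = -100

def build_labels(turns):
    """turns: list of (role, token_ids) pairs for one trajectory.
    Returns (input_ids, labels) aligned token-for-token."""
    input_ids, labels = [], []
    for role, ids in turns:
        input_ids.extend(ids)
        if role == "assistant":
            labels.extend(ids)  # supervised: every assistant turn
        else:
            labels.extend([IGNORE_INDEX] * len(ids))  # masked out
    return input_ids, labels

trajectory = [
    ("user", [1, 2, 3]),
    ("assistant", [4, 5]),
    ("user", [6]),
    ("assistant", [7, 8, 9]),
]
ids, labels = build_labels(trajectory)
# labels -> [-100, -100, -100, 4, 5, -100, 7, 8, 9]
```

Supervising every assistant turn (rather than only the final one) is what lets the model learn intermediate behavior such as tool calls and error recovery mid-trajectory.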
Training Configuration Highlights
- Base Model: Qwen/Qwen3-4B-Instruct-2507
- Method: LoRA (full precision base)
- Max Sequence Length: 4096 tokens
- LoRA Parameters: r=64, alpha=128
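For reference, the listed hyperparameters imply an effective adapter scaling of alpha / r = 128 / 64 = 2.0, which is applied to each LoRA update before it is added to the base weights. A small sketch of those values (field names follow the PEFT `LoraConfig` convention; target modules are not stated in this card and are omitted):

```python
# Hyperparameters from the card above, expressed as a plain dict.
lora_hparams = {
    "r": 64,            # LoRA rank
    "lora_alpha": 128,  # scaling numerator
    "max_seq_length": 4096,
}

# Effective scaling factor applied to each low-rank update (alpha / r).
scaling = lora_hparams["lora_alpha"] / lora_hparams["r"]  # -> 2.0
```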
Usage Considerations
Users must comply with the MIT License for the training datasets (u-10bei/sft_alfworld_trajectory_dataset_v5 and u-10bei/dbbench_sft_dataset_react_v4) and the original terms of use for the Qwen3-4B-Instruct-2507 base model.