Overview
This repository provides a LoRA adapter (r=16) fine-tuned from the Qwen/Qwen3-4B-Instruct-2507 base model. It is designed to improve the base model's performance on complex, multi-turn agent tasks.
Key Capabilities
- Enhanced Multi-Turn Agent Performance: The adapter is trained to improve the model's ability to handle sequential, interactive tasks.
- Task Specialization: Optimized for two distinct domains:
  - ALFWorld: Household task execution, involving understanding environments and performing actions.
  - DBBench: Database operations, likely including query generation, execution, and result interpretation.
- Comprehensive Learning: The training objective applies loss to all assistant turns, enabling the model to learn:
  - Environment observation
  - Action selection
  - Tool use
  - Error recovery mechanisms
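The "loss on all assistant turns" objective can be sketched as label masking: tokens from non-assistant turns are assigned the standard ignore index (-100), so only assistant tokens contribute to the cross-entropy loss. This is a minimal illustration of the idea, not the repository's actual training code; the function name and data layout are assumptions.

```python
IGNORE_INDEX = -100  # PyTorch CrossEntropyLoss skips targets with this value


def build_labels(token_spans):
    """Build per-token labels for multi-turn SFT where every assistant
    turn is supervised (not just the final one).

    token_spans: list of (role, token_ids) pairs in conversation order.
    Returns a flat label list aligned with the concatenated token ids:
    assistant tokens keep their ids; all other tokens are masked out.
    """
    labels = []
    for role, ids in token_spans:
        if role == "assistant":
            labels.extend(ids)  # supervised: contributes to the loss
        else:
            labels.extend([IGNORE_INDEX] * len(ids))  # excluded from the loss
    return labels
```

With this scheme the model receives gradient signal at every step of a trajectory, which is what lets it learn intermediate behaviors such as action selection and error recovery rather than only final answers.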
Training Details
- Base Model: Qwen/Qwen3-4B-Instruct-2507
- Method: LoRA (Low-Rank Adaptation) with full precision base model
- Max Sequence Length: 2048 tokens
- Dataset: Trained on u-10bei/sft_alfworld_trajectory_dataset_v5, which is licensed under MIT.
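For intuition, the LoRA method above keeps the frozen base weight W untouched and trains only two low-rank matrices A (r × d_in) and B (d_out × r), applying the update W + (alpha/r)·BA. The sketch below shows just this update rule in plain Python; the alpha value and matrix shapes are illustrative assumptions, not values from this repository.

```python
def matmul(X, Y):
    """Plain-Python matrix product of X (m x k) and Y (k x n)."""
    inner, cols = len(Y), len(Y[0])
    return [
        [sum(row[k] * Y[k][j] for k in range(inner)) for j in range(cols)]
        for row in X
    ]


def lora_delta(A, B, alpha, r):
    """Low-rank weight update: delta_W = (alpha / r) * B @ A.

    A is (r x d_in), B is (d_out x r); only A and B carry gradients
    during fine-tuning, while the base weight W stays frozen.
    """
    scale = alpha / r
    return [[scale * v for v in row] for row in matmul(B, A)]
```

At inference time the delta can either be added on the fly or merged into W once, which is why adapter-only releases like this one stay small (only A and B are shipped).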
Usage Notes
This repository contains only the LoRA adapter weights. Users must load the base model (Qwen/Qwen3-4B-Instruct-2507) separately and apply the adapter on top; optionally, the adapter can be merged into the base weights for standalone deployment.
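A typical loading flow with Hugging Face Transformers and PEFT might look like the following. The adapter path is a placeholder for this repository's id (which is not stated here), and the merge step is optional.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_ID = "Qwen/Qwen3-4B-Instruct-2507"
ADAPTER_PATH = "path/to/this-adapter"  # placeholder: replace with the actual repo id

# Load the frozen base model and its tokenizer
base = AutoModelForCausalLM.from_pretrained(BASE_ID)
tokenizer = AutoTokenizer.from_pretrained(BASE_ID)

# Attach the LoRA adapter weights on top of the base model
model = PeftModel.from_pretrained(base, ADAPTER_PATH)

# Optional: fold the adapter into the base weights for faster inference
model = model.merge_and_unload()
```

Keeping the adapter unmerged lets you swap or disable it at runtime, while merging produces a single set of weights with no PEFT dependency at inference time.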