Qwen3-4B ALFWorld+DBBench Mixed LoRA Adapter
This repository provides a LoRA adapter (r=64) fine-tuned with Unsloth from the Qwen/Qwen3-4B-Instruct-2507 base model. Note that it contains only the LoRA adapter weights; the base model must be downloaded and loaded separately.
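Because only the adapter weights are shipped, inference typically loads the base model first and then attaches the adapter, e.g. via the PEFT library. A minimal sketch, assuming `transformers` and `peft` are installed; the adapter identifier below is a placeholder, not this repository's actual id:

```python
BASE_MODEL = "Qwen/Qwen3-4B-Instruct-2507"
ADAPTER_ID = "path/or/hub-id-of-this-adapter"  # placeholder: replace with this repo's id

def load_adapter(base_id: str = BASE_MODEL, adapter_id: str = ADAPTER_ID):
    """Load the full-precision base model, then attach the LoRA adapter on top."""
    # Imports are deferred so the sketch can be read without the libraries present.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    model = AutoModelForCausalLM.from_pretrained(
        base_id, torch_dtype="auto", device_map="auto"
    )
    # Wraps the base model; only the small adapter weights are added.
    model = PeftModel.from_pretrained(model, adapter_id)
    return model, tokenizer
```

`PeftModel.merge_and_unload()` can optionally fold the adapter into the base weights for faster inference.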
Key Capabilities & Training Objective
The primary objective of this adapter is to improve multi-turn agent task performance. It was trained on a mix of ALFWorld (household tasks) and DBBench (database operations) datasets. During training, loss is applied to all assistant turns within each multi-turn trajectory, which enables the model to learn:
- Environment observation
- Action selection
- Tool use
- Error recovery
This makes the adapter particularly adept at handling complex, sequential agentic workflows.
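The all-assistant-turns objective can be sketched as a label mask: every assistant token contributes to the loss, while user, system, and tool tokens are masked with the standard ignore index. This is an illustrative sketch, not the repository's actual preprocessing code:

```python
# Sketch of multi-turn loss masking: loss is applied to every assistant
# turn; all other roles are masked out with -100 (the ignore index used
# by cross-entropy in most training frameworks).
IGNORE_INDEX = -100

def build_labels(turns):
    """turns: list of (role, token_ids) pairs for one trajectory.
    Returns (input_ids, labels) aligned token-for-token."""
    input_ids, labels = [], []
    for role, ids in turns:
        input_ids.extend(ids)
        if role == "assistant":
            labels.extend(ids)  # supervised: every assistant turn
        else:
            labels.extend([IGNORE_INDEX] * len(ids))  # masked out
    return input_ids, labels

trajectory = [
    ("user", [1, 2, 3]),
    ("assistant", [4, 5]),
    ("user", [6]),
    ("assistant", [7, 8, 9]),
]
ids, labels = build_labels(trajectory)
# labels -> [-100, -100, -100, 4, 5, -100, 7, 8, 9]
```

Supervising every assistant turn (rather than only the final one) is what lets the model learn intermediate behavior such as tool calls and error recovery mid-trajectory.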
Training Configuration Highlights
- Base Model: Qwen/Qwen3-4B-Instruct-2507
- Method: LoRA (full precision base)
- Max Sequence Length: 4096 tokens
- LoRA Parameters: r=64, alpha=128
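For reference, the listed hyperparameters imply an effective adapter scaling of alpha / r = 128 / 64 = 2.0, which is applied to each LoRA update before it is added to the base weights. A small sketch of those values (field names follow the PEFT `LoraConfig` convention; target modules are not stated in this card and are omitted):

```python
# Hyperparameters from the card above, expressed as a plain dict.
lora_hparams = {
    "r": 64,            # LoRA rank
    "lora_alpha": 128,  # scaling numerator
    "max_seq_length": 4096,
}

# Effective scaling factor applied to each low-rank update (alpha / r).
scaling = lora_hparams["lora_alpha"] / lora_hparams["r"]  # -> 2.0
```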
Usage Considerations
Users must comply with the MIT License for the training datasets (u-10bei/sft_alfworld_trajectory_dataset_v5 and u-10bei/dbbench_sft_dataset_react_v4) and the original terms of use for the Qwen3-4B-Instruct-2507 base model.