Overview
thetmon/c21 is a LoRA adapter (r=64, alpha=128) for the Qwen/Qwen3-4B-Instruct-2507 base model. Trained with Unsloth, it targets improved performance on complex, multi-turn agent tasks.
Key Capabilities
- Multi-turn Agent Task Performance: Specifically fine-tuned to excel in scenarios requiring sequential decision-making and interaction.
- ALFWorld Optimization: Demonstrates improved performance on household task environments that require planning and step-by-step execution.
- DBBench Optimization: Enhances capabilities in database operation tasks, including query generation and interaction.
- Error Recovery: Because the training objective applies loss to all assistant turns, the model learns to recover from errors and adapt within multi-turn trajectories.
Training Details
- Base Model: Qwen/Qwen3-4B-Instruct-2507
- Method: LoRA (full precision base)
- Max Sequence Length: 4096 tokens
- Epochs: 3
- Learning Rate: 2e-04
- Training Data: A mix of the u-10bei/sft_alfworld_trajectory_dataset_v5 and u-10bei/dbbench_sft_dataset_react_v4 datasets, both licensed under MIT.
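For intuition on the r=64, alpha=128 setting: LoRA keeps the base weight frozen and adds a low-rank update scaled by alpha/r (2.0 here). The sketch below illustrates this with NumPy; the matrix dimensions and initializations are illustrative assumptions, not values from this adapter.

```python
import numpy as np

# Hyperparameters from the card: rank r=64, alpha=128.
r, alpha = 64, 128
scaling = alpha / r  # LoRA scales the low-rank update by alpha/r -> 2.0

# Toy dimensions for illustration only (not the real Qwen3 layer shapes).
d_out, d_in = 256, 256
rng = np.random.default_rng(0)
W = rng.normal(size=(d_out, d_in))       # frozen base weight
A = rng.normal(size=(r, d_in)) * 0.01    # trainable down-projection
B = np.zeros((d_out, r))                 # trainable up-projection, zero-init

# Effective weight after merging the adapter into the base layer.
W_eff = W + scaling * (B @ A)
```

With the conventional zero initialization of B, the merged weight equals the base weight at the start of training, so the adapter begins as an identity modification.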
When to Use
This adapter suits applications that need a 4B-parameter model with stronger agentic capabilities, particularly sequential decision-making, tool use, and interaction within structured environments such as household simulations or database interfaces. The base model must be loaded separately, and users must comply with the MIT license of the datasets and the base model's original terms.