thetmon/c9: Qwen3-4B Agent Task LoRA Adapter
This repository provides a LoRA adapter for the Qwen/Qwen3-4B-Instruct-2507 base model, specifically fine-tuned to improve its performance on multi-turn agent tasks. Unlike full model fine-tunes, this adapter contains only the LoRA weights, requiring the base model to be loaded separately.
Key Capabilities
- Enhanced Multi-Turn Agent Performance: Optimized for complex, multi-step interactions.
- Specialized Task Domains: Excels in household tasks (ALFWorld) and database operations (DBBench).
- Comprehensive Agent Learning: Trained on multi-turn trajectories covering environment observation, action selection, tool use, and error recovery.
Training Details
The adapter was trained using LoRA (r=64, alpha=128) on a full-precision base model, with a maximum sequence length of 4096, for 3 epochs. The training objective applied loss to all assistant turns, so the model learns from every step of an interaction rather than only the final response. The training data includes u-10bei/sft_alfworld_trajectory_dataset_v5 and u-10bei/dbbench_sft_dataset_react_v4, both distributed under the MIT License.
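The "loss on all assistant turns" objective can be sketched as a label-masking step: tokens from non-assistant turns get the label `-100`, which PyTorch's cross-entropy loss ignores by default. The token-level role tagging below is purely illustrative, not the actual tokenizer output.

```python
IGNORE_INDEX = -100  # default ignore_index for torch cross-entropy


def mask_non_assistant(token_ids, roles):
    """Return labels equal to token_ids on assistant tokens, IGNORE_INDEX elsewhere.

    roles[i] names the conversation role that produced token i.
    """
    return [
        tok if role == "assistant" else IGNORE_INDEX
        for tok, role in zip(token_ids, roles)
    ]


# Every assistant turn keeps its labels, so the loss covers the whole
# trajectory, not just the final response.
labels = mask_non_assistant(
    [11, 12, 13, 14, 15],
    ["user", "assistant", "assistant", "user", "assistant"],
)
# labels == [-100, 12, 13, -100, 15]
```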
Good For
- Developers building agents that require robust multi-turn interaction capabilities.
- Applications involving automated household tasks or database management through natural language.
- Extending the Qwen3-4B-Instruct model's agentic reasoning without full model fine-tuning.
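A multi-turn agent exchange would be supplied in the standard chat-message format that Qwen instruct models consume via `tokenizer.apply_chat_template`. The ReAct-style Thought/Action content below is a hypothetical ALFWorld-flavored example, not taken from the training data.

```python
# Illustrative multi-turn trajectory: environment observations arrive as
# user turns, and the agent's reasoning and actions are assistant turns.
messages = [
    {"role": "user", "content": "Put a clean mug on the coffee table."},
    {
        "role": "assistant",
        "content": "Thought: I should locate a mug first.\nAction: go to cabinet 1",
    },
    {"role": "user", "content": "Observation: You see a mug 1 on the cabinet 1."},
    {"role": "assistant", "content": "Action: take mug 1 from cabinet 1"},
]
```

Appending each new observation as a user turn and re-generating keeps the full trajectory in context, which is the interaction pattern this adapter was tuned for.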