thetmon/c9: Qwen3-4B Agent Task LoRA Adapter
This repository provides a LoRA adapter for the Qwen/Qwen3-4B-Instruct-2507 base model, fine-tuned to improve performance on multi-turn agent tasks. Unlike a full fine-tune, this repository contains only the LoRA weights, so the base model must be loaded separately and the adapter applied on top of it.
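A minimal loading sketch, assuming the `transformers` and `peft` packages are installed. The adapter id `thetmon/c9` and base model id are taken from this card; everything else is standard PEFT usage, not a documented API of this repository.

```python
def load_agent_model(adapter_id: str = "thetmon/c9",
                     base_id: str = "Qwen/Qwen3-4B-Instruct-2507"):
    """Return (model, tokenizer) with the LoRA adapter applied to the base model."""
    # Imports are kept local so the sketch can be read without the
    # heavyweight dependencies installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
    model = PeftModel.from_pretrained(base, adapter_id)  # attaches LoRA weights
    return model, tokenizer
```

For inference you would then call `model.generate(...)` as with any causal LM; to bake the adapter into the base weights, PEFT's `merge_and_unload()` can be used.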
Key Capabilities
- Enhanced Multi-Turn Agent Performance: Optimized for complex, multi-step interactions.
- Specialized Task Domains: Excels in household tasks (ALFWorld) and database operations (DBBench).
- Comprehensive Agent Learning: Trained to learn environment observation, action selection, tool use, and error recovery within multi-turn trajectories.
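The observation/action loop described above can be pictured as a chat-format trajectory. The exact message schema used by the training datasets is an assumption here; this toy ALFWorld-style example only illustrates the multi-turn structure.

```python
# Hypothetical multi-turn agent trajectory in chat-message form:
# environment observations arrive as user turns, the agent replies with
# a thought and an action (ReAct-style, matching the dataset naming).
trajectory = [
    {"role": "system",    "content": "You are a household agent. Act step by step."},
    {"role": "user",      "content": "Observation: You are in the kitchen. You see a fridge 1."},
    {"role": "assistant", "content": "Thought: I should check the fridge.\nAction: open fridge 1"},
    {"role": "user",      "content": "Observation: The fridge 1 is open. In it you see an apple 1."},
    {"role": "assistant", "content": "Thought: I can finish by taking the apple.\nAction: take apple 1 from fridge 1"},
]

# Every assistant turn is a training target under this card's objective.
assistant_turns = [m for m in trajectory if m["role"] == "assistant"]
```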
Training Details
The adapter was trained with LoRA (r=64, alpha=128) on the full-precision base model, using a maximum sequence length of 4096 tokens over 3 epochs. The training objective applied loss to all assistant turns rather than only the final response, so the model learns from every step of a multi-turn interaction. The training data comprises u-10bei/sft_alfworld_trajectory_dataset_v5 and u-10bei/dbbench_sft_dataset_react_v4, both distributed under the MIT License.
Good For
- Developers building agents that require robust multi-turn interaction capabilities.
- Applications involving automated household tasks or database management through natural language.
- Extending the Qwen3-4B-Instruct model's agentic reasoning without full model fine-tuning.