Thetmon/c16: LoRA Adapter for Enhanced Agentic Performance
The thetmon/c16 model is a LoRA adapter (r=64, alpha=128) built upon the Qwen/Qwen3-4B-Instruct-2507 base model. It was fine-tuned using LoRA and Unsloth to significantly improve performance on complex, multi-turn agent tasks.
Key Capabilities
- Multi-turn Agent Task Performance: Specifically optimized for scenarios requiring sequential decision-making and interaction.
- ALFWorld Proficiency: Demonstrates enhanced ability in household task environments, involving planning and execution.
- DBBench Operations: Improved performance in database-related tasks, including query generation and manipulation.
- Error Recovery: The training objective focused on applying loss to all assistant turns, enabling the model to learn from and recover from errors within multi-turn trajectories.
- Tool Use and Action Selection: Designed to better understand environment observations and make appropriate action and tool selections.
Training Details
This adapter was trained for 3 epochs with a learning rate of 2e-04 and a maximum sequence length of 4096. The training data included u-10bei/sft_alfworld_trajectory_dataset_v5 and u-10bei/dbbench_sft_dataset_react_v4, both distributed under the MIT License.
Good For
- Developing AI agents that require robust multi-turn interaction capabilities.
- Applications involving complex task execution in simulated environments like ALFWorld.
- Database interaction and automation requiring intelligent agent behavior.