Overview
The thetmon/c20 is a LoRA adapter (r=64) specifically fine-tuned from the Qwen/Qwen3-4B-Instruct-2507 base model. This adapter focuses on significantly improving the base model's capabilities in multi-turn agent tasks, particularly within the domains of household task execution (ALFWorld) and database operations (DBBench).
Key Capabilities
- Enhanced Multi-Turn Agent Performance: The adapter is trained to improve the model's ability to handle complex, sequential tasks requiring multiple interactions.
- Improved Environment Interaction: It enables better learning of environment observation and appropriate action selection.
- Advanced Tool Use: The fine-tuning process emphasizes effective tool utilization within agent trajectories.
- Error Recovery: The model is trained to recover from errors during multi-turn interactions, making it more robust for agentic workflows.
Training Details
This adapter was trained using LoRA (full precision base) with a maximum sequence length of 4096 over 3 epochs. The training objective applied loss to all assistant turns in the multi-turn trajectory, fostering comprehensive learning across the entire interaction sequence. The training data included u-10bei/sft_alfworld_trajectory_dataset_v5 and u-10bei/dbbench_sft_dataset_react_v4.
Good For
- Agentic Applications: Ideal for developers building AI agents that need to perform complex, multi-step tasks.
- Automated Task Execution: Suitable for scenarios requiring a model to interact with environments, select tools, and manage multi-turn dialogues to achieve specific goals.
- Research in Agent AI: Provides a fine-tuned component for exploring and developing more capable AI agents in household and database contexts.