Overview
ShogoMu/qwen25_7b_lora_agentbench_v11 is a 7.6-billion-parameter model derived from Qwen/Qwen2.5-7B-Instruct. It was fine-tuned with LoRA using Unsloth, and the adapter has been merged into the base model weights, so the repository provides full model weights ready for inference. Training specifically targeted multi-turn agent tasks, making the model proficient in scenarios that require sequential decision-making and interaction.
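Because the adapter is already merged, the model can be loaded like any standard causal LM, with no PEFT wrapper. A minimal inference sketch with Hugging Face `transformers` (the system/user messages and generation settings below are illustrative assumptions, not part of the model card):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ShogoMu/qwen25_7b_lora_agentbench_v11"

# The LoRA adapter is merged into the weights, so plain transformers loading works.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# Example multi-turn agent-style prompt (hypothetical task), formatted with
# the tokenizer's built-in chat template.
messages = [
    {"role": "system", "content": "You are an agent in a household environment."},
    {"role": "user", "content": "Task: put a clean mug on the coffee table."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```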
Key Capabilities
- Multi-turn Agent Optimization: Specifically designed for agentic workflows, learning from entire multi-turn trajectories.
- Intermediate Reasoning: The training process applied loss to all assistant turns, enabling the model to learn not just final answers but also intermediate thoughts, observation processing, and action selection.
- Error Recovery: Enhanced ability to handle and recover from errors within complex interactive environments.
- Specialized Task Performance: Optimized for tasks such as household navigation and interaction (ALFWorld) and database operations (DBBench).
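The "loss on all assistant turns" idea from the list above can be illustrated with a small, self-contained sketch. The trajectory content and the `loss_mask` helper are hypothetical, not the model's actual training data or code:

```python
# Hypothetical multi-turn agent trajectory in chat-message form, mixing
# intermediate thoughts, actions, and environment observations.
trajectory = [
    {"role": "user", "content": "Task: find a mug and put it on the table."},
    {"role": "assistant", "content": "Thought: check the cabinet first. Action: open cabinet 1"},
    {"role": "user", "content": "Observation: cabinet 1 contains a mug."},
    {"role": "assistant", "content": "Action: take mug from cabinet 1"},
]

def loss_mask(messages):
    """Return one flag per message: True where training loss is applied.

    Loss covers every assistant turn (thoughts and actions alike), not just
    the final answer; user/observation turns are masked out.
    """
    return [m["role"] == "assistant" for m in messages]

print(loss_mask(trajectory))  # -> [False, True, False, True]
```

In a real training pipeline this masking happens at the token level, but the turn-level picture is the same: every assistant segment contributes to the loss.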
Training Details
The model was trained for 2 epochs with a learning rate of 2e-6, using LoRA rank r=64 and alpha=128 (a scaling factor of alpha/r = 2). Training used a maximum sequence length of 2048 tokens. This trajectory-level training differentiates it from general-purpose instruction-tuned models by focusing on the nuances of agentic behavior.
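As a concrete picture of what "merging the adapter" means with these hyperparameters: LoRA learns a low-rank update B·A for each adapted weight matrix W, and merging folds it in as W' = W + (alpha/r)·B·A. With r=64 and alpha=128 the scaling factor is 2.0. A toy plain-Python sketch with tiny matrices (shapes are illustrative only; real layers use r=64 and much larger dimensions):

```python
r, alpha = 64, 128
scaling = alpha / r  # 128 / 64 = 2.0

def matmul(X, Y):
    """Plain-Python matrix multiply for the toy example."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

# Toy 2x2 base weight W and a rank-1 LoRA update (B: 2x1, A: 1x2).
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[0.5], [0.25]]
A = [[0.1, 0.2]]

# Merged weight: W' = W + scaling * (B @ A). After this step the adapter
# matrices are no longer needed at inference time.
BA = matmul(B, A)
W_merged = [[w + scaling * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, BA)]
print(W_merged)
```

This is why the repository ships full weights: once the scaled update is added into W, inference costs and code paths are identical to the base model's.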