ShogoMu/qwen25_7b_lora_agentbench_v6_e4
A 7.6-billion-parameter language model fine-tuned from Qwen/Qwen2.5-7B-Instruct for multi-turn agent tasks, with demonstrated strength in environments such as ALFWorld (household navigation) and DBBench (database operations). Because loss was applied to all assistant turns during training, the model learns intermediate reasoning, observation processing, action selection, and error recovery. It has a 32,768-token context length and ships as fully merged weights ready for inference.
Model Overview
ShogoMu/qwen25_7b_lora_agentbench_v6_e4 is a 7.6-billion-parameter language model fine-tuned from the Qwen/Qwen2.5-7B-Instruct base model. It was trained with LoRA via Unsloth, and the adapter weights have been merged into the base model, so the repository provides full model weights ready for direct inference.
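Because the adapters are already merged, the model loads like any standard causal LM. Below is a minimal inference sketch using the transformers library; the chat-template call follows the usual Qwen2.5 convention, and the prompt and generation settings are illustrative rather than values from this card.

```python
# Minimal sketch: load the merged model and run one chat turn.
# The sampling settings below are illustrative, not from this card.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ShogoMu/qwen25_7b_lora_agentbench_v6_e4"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [
    {"role": "system", "content": "You are an agent in a household environment."},
    {"role": "user", "content": "Task: put a clean mug on the coffee table. "
                                "Observation: you are in the kitchen."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```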
Key Capabilities
- Optimized for Multi-Turn Agent Tasks: Specifically designed to handle complex, multi-step interactions required in agentic workflows.
- Specialized Domains: Demonstrates proficiency in ALFWorld (household navigation and interaction) and DBBench (database operations).
- Enhanced Reasoning: Training applied loss to all assistant turns, so the model learns not only final answers but also intermediate reasoning steps, environmental observation processing, action selection, and error recovery (see the label-masking sketch after this list).
- Full Model Weights: No separate adapter loading is required, simplifying deployment and usage.
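The card does not include the training code, so the following is only a sketch of what applying loss to all assistant turns typically looks like in practice: labels are kept for every assistant span and set to -100 (the index ignored by the cross-entropy loss in transformers) everywhere else. The function and variable names are hypothetical, and chat-template control tokens are omitted for brevity.

```python
# Hypothetical sketch of multi-turn label masking: supervise every
# assistant turn, ignore system/user/observation tokens.
IGNORE_INDEX = -100  # value skipped by transformers' cross-entropy loss

def build_labels(turns, tokenizer):
    """turns: list of {"role": ..., "content": ...} dicts for one trajectory."""
    input_ids, labels = [], []
    for turn in turns:
        ids = tokenizer.encode(turn["content"], add_special_tokens=False)
        input_ids.extend(ids)
        if turn["role"] == "assistant":
            labels.extend(ids)  # loss applied to every assistant turn
        else:
            labels.extend([IGNORE_INDEX] * len(ids))  # masked out of the loss
    return input_ids, labels
```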
Training Details
The model was trained for 4 epochs with a learning rate of 2e-6 and a maximum sequence length of 2048 tokens. LoRA was configured with r=64 and alpha=128 (mapped onto a standard configuration in the sketch below).
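For reference, here is how those hyperparameters might map onto a peft configuration. The card states training used LoRA with Unsloth; this sketch uses the Hugging Face peft API instead, and the target_modules list is an assumption (the linear projections commonly adapted in Qwen2.5 models), not something stated on the card.

```python
# Sketch only: the card's hyperparameters expressed as a peft LoraConfig.
# target_modules is an assumption, not taken from the card.
from peft import LoraConfig

lora_config = LoraConfig(
    r=64,             # LoRA rank (from the card)
    lora_alpha=128,   # scaling alpha (from the card)
    target_modules=[  # assumed: typical Qwen2.5 linear layers
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)

# Matching training settings from the card (trainer wiring omitted):
training_kwargs = dict(
    num_train_epochs=4,
    learning_rate=2e-6,
    max_seq_length=2048,
)
```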
Good For
- Developing AI agents that require multi-turn interaction and complex decision-making (a hypothetical interaction loop is sketched after this list).
- Applications involving household automation or simulated environment navigation (ALFWorld-like tasks).
- Database interaction and operation tasks (DBBench-like scenarios).
- Use cases where explicit intermediate reasoning and error handling are crucial for agent performance.
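To make the multi-turn usage concrete, a hypothetical observation-action loop might look like the following. The env object and its reset/step methods are placeholders invented for illustration; real ALFWorld or DBBench harnesses define their own interfaces.

```python
# Hypothetical agent loop: feed environment observations back as user
# turns and append model outputs as assistant turns. `env` is a
# placeholder object, not a real ALFWorld/DBBench API.
def run_episode(model, tokenizer, env, max_turns=20):
    messages = [
        {"role": "system", "content": "You are a household agent."},
        {"role": "user", "content": env.reset()},  # initial observation
    ]
    for _ in range(max_turns):
        inputs = tokenizer.apply_chat_template(
            messages, add_generation_prompt=True, return_tensors="pt"
        ).to(model.device)
        out = model.generate(inputs, max_new_tokens=128, do_sample=False)
        action = tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True)
        messages.append({"role": "assistant", "content": action})
        observation, done = env.step(action)  # placeholder API
        if done:
            break
        messages.append({"role": "user", "content": observation})
    return messages
```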