Mountaingorillas/Qwen-2.5-7B-Instruct-Agentbench-lora-MixedLearning-v2
Mountaingorillas/Qwen-2.5-7B-Instruct-Agentbench-lora-MixedLearning-v2 is a 7.6 billion parameter instruction-tuned model, fine-tuned from Qwen/Qwen2.5-7B-Instruct, with a 32K context length. It is specifically optimized for multi-turn agent tasks, excelling in environments like ALFWorld and DBBench. The model utilizes a Hybrid Reasoning Schema (Data Mixing) to seamlessly switch between ReAct for database operations and native Function Calling for embodied tasks, ensuring strict adherence to task-specific formats.
Model Overview
This model, developed by Mountaingorillas, is a fully merged fine-tune of Qwen/Qwen2.5-7B-Instruct, featuring 7.6 billion parameters and a 32K context window. Unlike adapter-only versions, it can be loaded directly. Its core innovation lies in its optimization for multi-turn agent tasks, particularly for the LLM2025 Agent competition, targeting ALFWorld (household tasks) and DBBench (database operations).
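Because the LoRA weights are already merged, the checkpoint loads like any standard Qwen2.5 model. A minimal sketch using the Hugging Face `transformers` library (the repository id comes from this card; the helper name and generation settings are illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "Mountaingorillas/Qwen-2.5-7B-Instruct-Agentbench-lora-MixedLearning-v2"

def load_agent_model(device_map: str = "auto"):
    """Load the fully merged checkpoint; no PEFT/adapter step is needed."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype="auto",      # keep the dtype stored in the checkpoint
        device_map=device_map,   # spread layers across available devices
    )
    return tokenizer, model

# Usage (downloads the full ~7.6B-parameter weights on first call):
# tokenizer, model = load_agent_model()
```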
Key Capabilities & Innovations
- Hybrid Reasoning Schema (Data Mixing): The model is trained to dynamically adapt its inference format based on the prompt context, preventing common agentic failure modes like parsing errors.
- DBBench Optimization: Strictly adheres to the `ReAct` format for database operations, ensuring precise SQL string syntax.
- ALFWorld Optimization: Employs native Function Calling (`tool_calls` for the `act` function) for robust environment interactions, avoiding invalid-action errors.
- Multi-turn Learning: Loss is applied across all assistant turns in a trajectory, enhancing its ability to learn observation, action selection, tool use, and error recovery.
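To make the two schemas concrete, here is an illustrative sketch of what each mode's output can look like, with a small dispatcher that detects which format a completion follows. The exact strings and field names are assumptions for illustration, not taken from the training data:

```python
import json

# Illustrative DBBench-style completion: ReAct interleaves free-text
# reasoning ("Thought") with a structured "Action" line carrying raw SQL.
react_output = (
    "Thought: I need the row count before filtering.\n"
    "Action: Operation\n"
    "```sql\nSELECT COUNT(*) FROM orders;\n```"
)

# Illustrative ALFWorld-style completion: a native tool call to an `act`
# function, so the environment never has to parse free text.
tool_call_output = json.dumps(
    {"tool_calls": [{"name": "act", "arguments": {"action": "go to fridge 1"}}]}
)

def detect_schema(completion: str) -> str:
    """Classify a completion as ReAct or native function calling."""
    try:
        payload = json.loads(completion)
        if isinstance(payload, dict) and "tool_calls" in payload:
            return "function_calling"
    except json.JSONDecodeError:
        pass
    if "Action:" in completion:
        return "react"
    return "unknown"

print(detect_schema(react_output))      # -> react
print(detect_schema(tool_call_output))  # -> function_calling
```

In practice the model itself picks the format from the prompt context; a dispatcher like this is only useful on the harness side to route completions to the right parser.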
Training Details
- Base Model: `unsloth/Qwen2.5-7B-Instruct`
- Method: LoRA (merged into full weights)
- Epochs: 3
- Max Sequence Length: 3072
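The multi-turn loss described earlier can be sketched as a labeling rule: supervise every assistant token in a trajectory and mask all other tokens with the standard cross-entropy ignore index. The function name, roles, and token layout below are illustrative, not the actual training code:

```python
IGNORE_INDEX = -100  # conventional "ignore" value for cross-entropy loss

def build_labels(turns):
    """Given (role, token_ids) pairs for one trajectory, keep loss on all
    assistant turns and mask system/user/observation tokens."""
    input_ids, labels = [], []
    for role, token_ids in turns:
        input_ids.extend(token_ids)
        if role == "assistant":
            labels.extend(token_ids)                 # supervised
        else:
            labels.extend([IGNORE_INDEX] * len(token_ids))  # masked
    return input_ids, labels

trajectory = [
    ("system", [1, 2]),
    ("user", [3, 4, 5]),
    ("assistant", [6, 7]),   # first action: contributes to the loss
    ("user", [8]),           # environment observation: masked
    ("assistant", [9, 10]),  # recovery / next action: contributes too
]
ids, labels = build_labels(trajectory)
print(labels)  # -> [-100, -100, -100, -100, -100, 6, 7, -100, 9, 10]
```

Supervising every assistant turn, rather than only the final one, is what lets the model learn intermediate skills such as tool use and error recovery.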
Good For
- Developing and testing agents for complex, multi-turn tasks.
- Applications requiring precise adherence to structured output formats (e.g., SQL generation, function calls).
- Research into agentic AI and hybrid reasoning strategies.