Mountaingorillas/Qwen-2.5-7B-Instruct-Agentbench-lora-MixedLearning-v2

Text Generation · Concurrency Cost: 1 · Model Size: 7.6B · Quant: FP8 · Ctx Length: 32K · Published: Mar 1, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights · Cold

Mountaingorillas/Qwen-2.5-7B-Instruct-Agentbench-lora-MixedLearning-v2 is a 7.6 billion parameter instruction-tuned model, fine-tuned from Qwen/Qwen2.5-7B-Instruct, with a 32K context length. It is specifically optimized for multi-turn agent tasks, excelling in environments like ALFWorld and DBBench. The model utilizes a Hybrid Reasoning Schema (Data Mixing) to seamlessly switch between ReAct for database operations and native Function Calling for embodied tasks, ensuring strict adherence to task-specific formats.


Model Overview

This model, developed by Mountaingorillas, is a fully merged fine-tune of Qwen/Qwen2.5-7B-Instruct, featuring 7.6 billion parameters and a 32K context window. Unlike adapter-only versions, it can be loaded directly. Its core innovation lies in its optimization for multi-turn agent tasks, particularly for the LLM2025 Agent competition, targeting ALFWorld (household tasks) and DBBench (database operations).
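Because the LoRA weights are already merged into the full checkpoint, the model can be loaded like any standard Transformers causal LM, with no PEFT adapter step. A minimal sketch (the `device_map`/`torch_dtype` choices are illustrative defaults, not settings documented by the card):

```python
# Merged checkpoint: loads directly, no LoRA adapter attachment required.
MODEL_ID = "Mountaingorillas/Qwen-2.5-7B-Instruct-Agentbench-lora-MixedLearning-v2"

def load_model(device_map: str = "auto"):
    """Load tokenizer and merged model as a plain causal LM."""
    from transformers import AutoModelForCausalLM, AutoTokenizer  # deferred import
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype="auto",   # pick the checkpoint's native dtype
        device_map=device_map,
    )
    return tokenizer, model
```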

Key Capabilities & Innovations

  • Hybrid Reasoning Schema (Data Mixing): The model is trained to dynamically adapt its inference format based on the prompt context, preventing common agentic failure modes like parsing errors.
  • DBBench Optimization: Strictly adheres to the ReAct format for database operations, ensuring precise SQL string syntax.
  • ALFWorld Optimization: Employs native Function Calling (tool_calls for the act function) for robust environment interactions, avoiding invalid action errors.
  • Multi-turn Learning: Loss is applied across all assistant turns in a trajectory, enhancing its ability to learn observation, action selection, tool use, and error recovery.
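The two output styles above can be sketched side by side. Note this is an illustration, not the model's documented schema: the exact ReAct field names, the SQL fence convention, and the `act` tool signature are all assumptions made for the example.

```python
import re

def parse_react_sql(text: str):
    """Extract the Action name and SQL string from a ReAct-style DBBench reply."""
    action = re.search(r"Action:\s*(\w+)", text)
    sql = re.search(r"```sql\s*(.*?)\s*```", text, re.DOTALL)
    return (action.group(1) if action else None,
            sql.group(1) if sql else None)

# ALFWorld side: a native tool-call schema for a hypothetical `act` function,
# in the OpenAI-style tools format that chat templates commonly accept.
ACT_TOOL = {
    "type": "function",
    "function": {
        "name": "act",
        "description": "Execute one ALFWorld environment action.",
        "parameters": {
            "type": "object",
            "properties": {"action": {"type": "string"}},
            "required": ["action"],
        },
    },
}

reply = (
    "Thought: I need the row count.\n"
    "Action: Operation\n"
    "```sql\nSELECT COUNT(*) FROM users;\n```"
)
print(parse_react_sql(reply))  # → ('Operation', 'SELECT COUNT(*) FROM users;')
```

Keeping the two formats strictly separated per task is what prevents the parsing and invalid-action failures the bullets describe.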

Training Details

  • Base Model: unsloth/Qwen2.5-7B-Instruct
  • Method: LoRA (merged into full weights)
  • Epochs: 3
  • Max Sequence Length: 3072
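The multi-turn learning setup described above (loss applied across all assistant turns, not just the last one) is typically implemented by masking non-assistant tokens out of the labels. A toy sketch with made-up token IDs, using the Hugging Face convention of `-100` as the ignore index:

```python
IGNORE_INDEX = -100  # label value that cross-entropy losses skip (HF convention)

def build_labels(turns):
    """turns: list of (role, token_ids). Only assistant tokens are supervised."""
    labels = []
    for role, token_ids in turns:
        if role == "assistant":
            labels.extend(token_ids)                        # contributes to loss
        else:
            labels.extend([IGNORE_INDEX] * len(token_ids))  # masked out
    return labels

trajectory = [
    ("user", [1, 2]),       # observation
    ("assistant", [3, 4]),  # action 1: supervised
    ("user", [5]),          # new observation
    ("assistant", [6]),     # action 2: also supervised
]
print(build_labels(trajectory))  # → [-100, -100, 3, 4, -100, 6]
```

Supervising every assistant turn in the trajectory is what lets the model learn intermediate behaviors such as error recovery, rather than only the final answer.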

Good For

  • Developing and testing agents for complex, multi-turn tasks.
  • Applications requiring precise adherence to structured output formats (e.g., SQL generation, function calls).
  • Research into agentic AI and hybrid reasoning strategies.