ShogoMu/qwen25_7b_lora_agentbench_v11

Text Generation · Concurrency Cost: 1 · Model Size: 7.6B · Quant: FP8 · Ctx Length: 32k · Published: Feb 28, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights · Cold

ShogoMu/qwen25_7b_lora_agentbench_v11 is a 7.6-billion-parameter language model fine-tuned from Qwen/Qwen2.5-7B-Instruct and optimized for multi-turn agent tasks. It excels in complex interactive environments such as ALFWorld and database operations in DBBench. Because loss was applied to all assistant turns in multi-turn trajectories during training, the model learns intermediate reasoning, action selection, and error recovery, yielding stronger agentic capabilities.
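Because the LoRA adapter is already merged (see Overview below), the checkpoint loads like any standard causal LM. Here is a minimal inference sketch with Hugging Face transformers; the prompt and generation settings are illustrative only, not recommendations from the model author:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ShogoMu/qwen25_7b_lora_agentbench_v11"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# One agent-style turn; a real loop would feed environment observations back in.
messages = [
    {"role": "user",
     "content": "You are in a kitchen. Your task: put a clean mug on the table."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```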


Overview

ShogoMu/qwen25_7b_lora_agentbench_v11 is a 7.6 billion parameter model derived from Qwen/Qwen2.5-7B-Instruct. It has been fine-tuned using LoRA and Unsloth, with the adapter merged into the base model weights, providing full model weights ready for inference. The model's training specifically targeted multi-turn agent tasks, making it proficient in scenarios requiring sequential decision-making and interaction.
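The published weights are already merged, so no adapter handling is needed at load time. For readers curious about the merge step itself, this is a minimal sketch of the standard peft workflow; the adapter path and output directory are hypothetical, and the author's actual merge may have used Unsloth's own saving utilities instead:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model the adapter was trained against.
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-7B-Instruct", torch_dtype="auto"
)

# Attach the LoRA adapter (hypothetical local path) and fold its low-rank
# updates into the base weights, so inference needs no adapter machinery.
model = PeftModel.from_pretrained(base, "path/to/lora_adapter")
merged = model.merge_and_unload()

# Save full merged weights plus tokenizer, ready for standard loading.
merged.save_pretrained("merged_model")
AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct").save_pretrained("merged_model")
```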

Key Capabilities

  • Multi-turn Agent Optimization: Specifically designed for agentic workflows, learning from entire multi-turn trajectories.
  • Intermediate Reasoning: The training process applied loss to all assistant turns, enabling the model to learn not just final answers but also intermediate thoughts, observation processing, and action selection (see the loss-masking sketch after this list).
  • Error Recovery: Enhanced ability to handle and recover from errors within complex interactive environments.
  • Specialized Task Performance: Optimized for tasks such as household navigation and interaction (ALFWorld) and database operations (DBBench).
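To make the trajectory-level supervision concrete: a common implementation tokenizes each turn and sets the labels of non-assistant tokens to -100 so the cross-entropy loss ignores them. The sketch below assumes Qwen2.5's ChatML-style turn format; the author's actual masking code is not published:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")

def build_labels(messages):
    """Tokenize a multi-turn trajectory, supervising only assistant turns."""
    input_ids, labels = [], []
    for msg in messages:
        # Qwen2.5 wraps each turn in ChatML-style delimiters.
        text = f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n"
        turn_ids = tokenizer(text, add_special_tokens=False).input_ids
        input_ids.extend(turn_ids)
        if msg["role"] == "assistant":
            labels.extend(turn_ids)                # contributes to the loss
        else:
            labels.extend([-100] * len(turn_ids))  # ignored by cross-entropy
    return input_ids, labels

# Every assistant turn is supervised, including the intermediate one.
trajectory = [
    {"role": "user", "content": "Task: find a mug. Obs: you see cabinet 1."},
    {"role": "assistant", "content": "Thought: check the cabinet. Action: open cabinet 1"},
    {"role": "user", "content": "Obs: cabinet 1 contains a mug."},
    {"role": "assistant", "content": "Action: take mug from cabinet 1"},
]
ids, labels = build_labels(trajectory)
```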

Training Details

The model was trained for 2 epochs with a learning rate of 2e-6, using LoRA with r=64 and alpha=128. The maximum sequence length during training was 2048 tokens (the underlying Qwen2.5 architecture supports the 32k context noted in the metadata above). This specialized training regime differentiates it from general-purpose instruction-tuned models by focusing on the nuances of agentic behavior.
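Mapped onto a typical Unsloth/peft + TRL setup, those hyperparameters would look roughly like the configuration below; the target modules, output directory, and all unstated settings are assumptions, not taken from the model card:

```python
from peft import LoraConfig
from trl import SFTConfig

# LoRA settings stated on the model card: r=64, alpha=128.
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    task_type="CAUSAL_LM",
    # Target modules are not stated; attention + MLP projections are a common choice.
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Stated run settings: 2 epochs, learning rate 2e-6, max sequence length 2048.
training_args = SFTConfig(
    num_train_epochs=2,
    learning_rate=2e-6,
    max_seq_length=2048,
    output_dir="qwen25_7b_lora_agentbench_v11",  # hypothetical
)
```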