melon1891/agentbench-qwen3-4b-lr5e6-20260224v2

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:4BQuant:BF16Ctx Length:32kPublished:Feb 24, 2026License:apache-2.0Architecture:Transformer Open Weights Warm

The melon1891/agentbench-qwen3-4b-lr5e6-20260224v2 is a 4 billion parameter language model fine-tuned from Qwen/Qwen3-4B-Instruct-2507. It is specifically optimized for multi-turn agent task performance, focusing on household tasks (ALFWorld) and database operations (DBBench). This model excels at learning environment observation, action selection, tool use, and error recovery within complex multi-turn trajectories, making it suitable for autonomous agent applications.

Loading preview...

Overview

melon1891/agentbench-qwen3-4b-lr5e6-20260224v2 is a 4 billion parameter language model, fine-tuned from the Qwen/Qwen3-4B-Instruct-2507 base model. This model leverages LoRA (merged into the base) and Unsloth for efficient training, with a focus on enhancing its capabilities for complex, multi-turn agentic tasks.

Key Capabilities

  • Multi-turn Agent Task Performance: Specifically trained to improve performance in scenarios requiring sequential decision-making and interaction.
  • Environment Interaction: Designed to learn from environment observations and select appropriate actions.
  • Tool Use: Optimized for effective integration and utilization of tools within agent workflows.
  • Error Recovery: Capable of learning to recover from errors encountered during multi-turn trajectories.
  • Specialized Training: Loss is applied to all assistant turns, enabling comprehensive learning across an entire multi-turn interaction.

Good for

  • Autonomous Agents: Ideal for developing agents that need to perform household tasks (e.g., ALFWorld) or database operations (e.g., DBBench).
  • Complex Task Automation: Suitable for applications requiring models to manage multi-step processes, interact with environments, and utilize tools.
  • Research in Agentic AI: Provides a specialized base for further experimentation and development in agent-based language models.