moushi21/agent-bench-merged12

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:4BQuant:BF16Ctx Length:32kPublished:Mar 2, 2026License:apache-2.0Architecture:Transformer Open Weights Warm

moushi21/agent-bench-merged12 is a 4 billion parameter Qwen3-based model, created by moushi21, specifically optimized for agentic tasks. This model was developed using the TIES-Merging method to combine specialized LoRA adapters for ALFWorld and DBBench (SQL) tasks. It demonstrates balanced performance on agentic benchmarks, achieving 0.60 Pass@1 on ALFWorld and 0.5353 on DBBench. The model is designed for direct inference in agent-based applications requiring strong reasoning in interactive environments and database interactions.

Loading preview...

Overview

moushi21/agent-bench-merged12 is a 4 billion parameter model built upon the Qwen/Qwen3-4B-Instruct-2507 base, specifically engineered for enhanced performance in agentic tasks. This model was created by merging specialized LoRA adapters using the TIES-Merging method via Mergekit, integrating expertise from ALFWorld trajectories and DBBench (SQL) tasks.

Key Capabilities

  • Agentic Task Optimization: Fine-tuned to excel in complex agent environments, particularly those involving interactive decision-making and database querying.
  • Specialized Merging: Utilizes the TIES-Merging method to combine distinct LoRA adapters, ensuring a balanced integration of specialized skills.
  • Direct Inference: Provided as full model weights, eliminating the need to load separate adapters for deployment.

Performance Highlights

The model exhibits strong performance across its target agentic benchmarks:

  • ALFWorld: Achieves a Pass@1 score of 0.60.
  • DBBench: Scores 0.5353.

Training Data & Licensing

The source models were fine-tuned on specific datasets:

  • ALFWorld: u-10bei/sft_alfworld_trajectory_dataset (v1 to v5)
  • DBBench: u-10bei/dbbench_sft_dataset_react (v1 to v4)
    Both datasets are distributed under the MIT License. Users must comply with the MIT license and the base model's original terms of use.