moushi21/agent-bench-dbbench-merged4

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:4BQuant:BF16Ctx Length:32kPublished:Mar 1, 2026License:apache-2.0Architecture:Transformer Open Weights Warm

The moushi21/agent-bench-dbbench-merged4 is a 4 billion parameter Qwen3-Instruct model, fine-tuned from Qwen/Qwen3-4B-Instruct-2507. This merged full-parameter model is specialized for DBBench trajectory tasks, excelling at handling multi-turn environment observations and action selections. It is optimized for high-speed inference and easy deployment, making it suitable for agentic workflows requiring database interaction. The model was trained using LoRA fine-tuning, with weights merged back into the base model.

Loading preview...

Overview

The moushi21/agent-bench-dbbench-merged4 is a 4 billion parameter language model, derived from the Qwen3-Instruct architecture, specifically fine-tuned from Qwen/Qwen3-4B-Instruct-2507. Unlike models that require separate LoRA adapters, this version integrates the LoRA weights directly into the base model using Unsloth's merge_and_unload method. This design choice prioritizes high-speed inference and simplified deployment.

Key Capabilities

  • Specialized for DBBench Trajectory Tasks: The model's primary focus is on navigating and performing actions within database environments.
  • Multi-turn Interaction: It is trained to effectively process and respond to multi-turn environment observations and select appropriate actions.
  • Merged Full-Parameter Model: Offers the performance benefits of a fine-tuned model without the overhead of managing separate adapter weights.
  • Efficient Inference: Designed for fast execution due to its merged architecture.

Training Details

The model was fine-tuned using LoRA with specific parameters (r=64, alpha=128) over 500 steps, utilizing a learning rate of 5e-07. The training data consisted of several versions of the dbbench_sft_dataset_react from u-10bei, all distributed under the MIT License. The maximum sequence length used during training was 4096 tokens.