moushi21/agent-bench-dbbench-merged4
The moushi21/agent-bench-dbbench-merged4 is a 4 billion parameter Qwen3-Instruct model, fine-tuned from Qwen/Qwen3-4B-Instruct-2507. This merged full-parameter model is specialized for DBBench trajectory tasks, excelling at handling multi-turn environment observations and action selections. It is optimized for high-speed inference and easy deployment, making it suitable for agentic workflows requiring database interaction. The model was trained using LoRA fine-tuning, with weights merged back into the base model.
Loading preview...
Overview
The moushi21/agent-bench-dbbench-merged4 is a 4 billion parameter language model, derived from the Qwen3-Instruct architecture, specifically fine-tuned from Qwen/Qwen3-4B-Instruct-2507. Unlike models that require separate LoRA adapters, this version integrates the LoRA weights directly into the base model using Unsloth's merge_and_unload method. This design choice prioritizes high-speed inference and simplified deployment.
Key Capabilities
- Specialized for DBBench Trajectory Tasks: The model's primary focus is on navigating and performing actions within database environments.
- Multi-turn Interaction: It is trained to effectively process and respond to multi-turn environment observations and select appropriate actions.
- Merged Full-Parameter Model: Offers the performance benefits of a fine-tuned model without the overhead of managing separate adapter weights.
- Efficient Inference: Designed for fast execution due to its merged architecture.
Training Details
The model was fine-tuned using LoRA with specific parameters (r=64, alpha=128) over 500 steps, utilizing a learning rate of 5e-07. The training data consisted of several versions of the dbbench_sft_dataset_react from u-10bei, all distributed under the MIT License. The maximum sequence length used during training was 4096 tokens.