koguma-ai/dbbench-combined-baseline0301

Text Generation · Concurrency Cost: 1 · Model Size: 7.6B · Quant: FP8 · Ctx Length: 32k · Published: Feb 28, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights

koguma-ai/dbbench-combined-baseline0301 is a 7.6-billion-parameter instruction-tuned causal language model fine-tuned from Qwen2.5-7B-Instruct. Developed by koguma-ai, it is optimized for database operation tasks and shows improved performance on the DB Bench benchmark. It supports a 32,768-token context length and is designed for SQL generation, action selection, and error recovery within database environments.


Overview

This model, koguma-ai/dbbench-combined-baseline0301, is a 7.6-billion-parameter instruction-tuned language model based on Qwen2.5-7B-Instruct. It was created by koguma-ai via LoRA fine-tuning, with the adapter subsequently merged into a full 16-bit model. The primary training objective was to improve performance on DB Bench (database operation) tasks within the AgentBench evaluation framework.
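Since the model is instruction-tuned from Qwen2.5-7B-Instruct, prompts follow the ChatML-style template used by the Qwen2.5 family. The sketch below builds such a prompt by hand for a SQL-generation request; the system prompt wording is illustrative, not taken from the model card, and in practice a tokenizer's chat template would handle this formatting.

```python
# Sketch: rendering a conversation into the ChatML-style format used by
# Qwen2.5-family models. The system prompt text is an illustrative assumption.

def build_prompt(messages):
    """Render a list of {role, content} dicts into ChatML format."""
    parts = []
    for msg in messages:
        parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n")
    # Open the assistant turn so generation continues from here.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = build_prompt([
    {"role": "system", "content": "You are a database assistant. Answer with SQL."},
    {"role": "user", "content": "List all rows of table users where age > 30."},
])
print(prompt)
```

When loading the model with a standard inference library, the tokenizer's built-in chat template should produce an equivalent string.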

Key Capabilities

  • Database Operation Optimization: Specifically trained to improve performance in database-related tasks, including SQL generation, action selection, and error recovery.
  • Multi-Turn Trajectory Learning: Loss is applied across all assistant turns in multi-turn interactions, enabling robust learning of complex database sequences.
  • Fine-tuned from Qwen2.5-7B-Instruct: Benefits from the strong base capabilities of the Qwen2.5-7B-Instruct model.
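The multi-turn trajectory learning described above, with loss applied on every assistant turn, can be sketched as a label-masking step: non-assistant tokens get the ignore index (-100, the default for PyTorch's cross-entropy loss) so only assistant tokens contribute to the loss. The token IDs and turn structure below are illustrative, not from the actual training pipeline.

```python
# Sketch: building training labels for a multi-turn trajectory where loss is
# applied across all assistant turns. Non-assistant tokens are masked with
# -100 so they are ignored by the loss. Token IDs here are placeholders.

IGNORE_INDEX = -100

def build_labels(turns):
    """turns: list of (role, token_ids). Returns (input_ids, labels)."""
    input_ids, labels = [], []
    for role, token_ids in turns:
        input_ids.extend(token_ids)
        if role == "assistant":
            labels.extend(token_ids)  # supervise every assistant turn
        else:
            labels.extend([IGNORE_INDEX] * len(token_ids))
    return input_ids, labels

ids, labels = build_labels([
    ("user", [101, 102]),
    ("assistant", [201, 202, 203]),
    ("user", [103]),
    ("assistant", [204]),
])
# Both assistant turns are supervised; user tokens are masked out.
```

Supervising all assistant turns, rather than only the final one, lets the model learn intermediate steps of a database interaction such as action selection and error recovery.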

Training Details

The model was fine-tuned on approximately 3,000 samples from the DB Bench v1-v4 datasets (u-10bei/dbbench_sft_dataset_react). Training excluded ALFWorld data to preserve the base model's inherent capabilities in that domain. It used a maximum sequence length of 2048 tokens, ran for 2 epochs with a learning rate of 2e-6, and employed LoRA with r=64 and alpha=128.
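The stated hyperparameters can be collected into a single configuration, sketched below as a plain dict. The field names are illustrative (not from the actual training code); the values are those given on the card, and the scaling computation shows a common rule of thumb for LoRA.

```python
# Sketch: the fine-tuning hyperparameters stated on the model card, as a
# plain config dict. Field names are illustrative assumptions.

config = {
    "base_model": "Qwen/Qwen2.5-7B-Instruct",
    "dataset": "u-10bei/dbbench_sft_dataset_react",  # ~3,000 DB Bench samples
    "max_seq_len": 2048,
    "epochs": 2,
    "learning_rate": 2e-6,
    "lora_r": 64,
    "lora_alpha": 128,
}

# Common convention: the effective LoRA scaling factor is alpha / r.
scaling = config["lora_alpha"] / config["lora_r"]
print(scaling)  # 2.0
```

With alpha = 2r, the adapter's update is scaled by 2, a frequently used setting that keeps the LoRA contribution prominent at a conservative learning rate like 2e-6.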

Limitations

  • Specialized Optimization: Primarily optimized for DB Bench tasks; performance on other domains like ALFWorld relies solely on the base model's capabilities.
  • Identified Weaknesses: Exhibits weaker performance in specific categories such as aggregation-MAX (16.7%) and INSERT (33.3%).

Good For

  • Developers and researchers focused on database interaction and automation.
  • Applications requiring SQL generation, database action planning, and error handling within structured environments.