bsq1989/qwen_4b_sql

Text Generation | Concurrency Cost: 1 | Model Size: 4B | Quant: BF16 | Ctx Length: 32k | Published: Apr 5, 2026 | Architecture: Transformer | Cold

bsq1989/qwen_4b_sql is a 4-billion-parameter model based on Qwen3-4B-Base and fine-tuned for text-to-SQL generation. It converts natural language questions and database schemas into SQL queries, and was trained with full supervised fine-tuning (SFT) on a cleaned split of the PipableAI/pip-txt-to-sql-spider-bird-dataset. It reaches higher execution accuracy on the Spider benchmark than its 1.7B-parameter counterpart, which makes it suitable for schema-conditioned SQL generation experiments.

Model Overview

bsq1989/qwen_4b_sql is a 4 billion parameter model based on Qwen/Qwen3-4B-Base, specifically fine-tuned for text-to-SQL generation. It was trained using LLaMA-Factory with full SFT on a cleaned version of the PipableAI/pip-txt-to-sql-spider-bird-dataset.

Key Capabilities & Performance

  • Text-to-SQL Generation: Converts natural language questions and provided database schemas into SQL queries.
  • Spider Benchmark: Achieves 67.6% Test Suite execution accuracy and 35.0% exact match on the Spider dev set.
  • Performance Comparison: Outperforms the Qwen3-1.7B-Base model in execution accuracy on the Spider benchmark, indicating a stronger capability for generating executable SQL.
  • Training Details: Trained on a single NVIDIA H20 96GB GPU in bf16 precision with a training context length of 2048 tokens.
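
The gap between the execution accuracy and exact match figures above is easier to read with a toy example. The sketch below is an illustration only, not the official Spider or Test Suite evaluator: it scores a few hand-written prediction/gold pairs against an in-memory SQLite database, using result-set equality as a stand-in for execution accuracy and a crude normalized string comparison as a stand-in for exact match. The table, the queries, and the helper names (run_query, normalize, score) are all hypothetical.

```python
import sqlite3

def run_query(conn, sql):
    """Return the rows of a query in a canonical order, or None if it fails."""
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None
    return sorted(rows, key=repr)

def normalize(sql):
    """Crude string normalization: lowercase, drop semicolons, collapse whitespace."""
    return " ".join(sql.lower().replace(";", " ").split())

def score(conn, pairs):
    """pairs is a list of (predicted_sql, gold_sql); returns (exec_acc, exact_match)."""
    exec_hits = exact_hits = 0
    for pred, gold in pairs:
        pred_rows = run_query(conn, pred)
        gold_rows = run_query(conn, gold)
        # Execution match: the prediction runs and returns the same rows as gold.
        if pred_rows is not None and pred_rows == gold_rows:
            exec_hits += 1
        # String match: the two queries are textually identical after normalization.
        if normalize(pred) == normalize(gold):
            exact_hits += 1
    n = len(pairs)
    return exec_hits / n, exact_hits / n

# Hypothetical toy database and predictions, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
INSERT INTO singer VALUES (1, 'Ana', 30), (2, 'Ben', 25);
""")
pairs = [
    # Same text up to case: counts for both metrics.
    ("SELECT count(*) FROM singer", "SELECT COUNT(*) FROM singer"),
    # Different text, same result on this data: execution match only.
    ("SELECT name FROM singer WHERE age >= 26", "SELECT name FROM singer WHERE age > 25"),
    # Invalid column reference: fails to execute, counts for neither.
    ("SELECT nme FROM singer", "SELECT name FROM singer"),
]
exec_acc, exact = score(conn, pairs)
print(f"execution accuracy: {exec_acc:.3f}, exact match: {exact:.3f}")
```

The second pair shows why a query can count as correct under execution-based scoring while failing a textual comparison. It also shows the weakness of checking a single database: the two predicates would disagree on a row with age 25.5, which is the kind of false positive that evaluating on several database instances is meant to reduce.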

Intended Use Cases

  • Text-to-SQL Research Baselines: Ideal for establishing baseline performance in text-to-SQL research.
  • Schema-Conditioned SQL Generation: Suitable for experiments requiring SQL generation based on explicit schema definitions.
  • Single-Turn SQL Generation: Designed for generating SQL from a single natural language query paired with schema text.
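
A single-turn request pairs schema text with one question. The sketch below shows one way such an input could be assembled and how a raw completion could be trimmed to a single statement. The prompt layout ("### Schema", "### Question", "### SQL") is an assumption made for illustration; the template actually used during fine-tuning is not stated here and should be checked against the training data. The function names build_prompt and extract_sql are likewise hypothetical.

```python
def build_prompt(schema: str, question: str) -> str:
    """Assemble a single-turn prompt from schema text and one question.

    The section markers are a hypothetical layout, not the documented
    training template for this model.
    """
    return (
        "### Schema\n"
        f"{schema.strip()}\n\n"
        "### Question\n"
        f"{question.strip()}\n\n"
        "### SQL\n"
    )

def extract_sql(completion: str) -> str:
    """Keep the text before the first semicolon or blank line, on one line."""
    text = completion.strip()
    for stop in (";", "\n\n"):
        idx = text.find(stop)
        if idx != -1:
            text = text[:idx]
    # Collapse line breaks and repeated spaces into single spaces.
    return " ".join(text.split())

schema = "CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);"
prompt = build_prompt(schema, "How many singers are older than 25?")
print(prompt)

# A made-up completion, standing in for model output.
raw = "SELECT count(*)\nFROM singer WHERE age > 25;\n\n### Question\n..."
print(extract_sql(raw))
```

The prompt string would then be passed to whatever inference stack serves the model. Since training used a 2048-token context, it is probably safest to keep schema plus question well under that length.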

Limitations

  • Performance may decrease on more diverse or open-ended SQL benchmarks beyond Spider.
  • Can occasionally produce invalid column references when encountering out-of-distribution schemas.
  • Not validated for production-grade database access control or complex multi-turn agent workflows without additional tooling.
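
Invalid column references can be caught before a query reaches a real database. The sketch below is a minimal pre-check, assuming a SQLite-compatible schema: it loads the schema DDL into an empty in-memory database and asks SQLite to compile the generated statement with EXPLAIN, which surfaces unknown tables and columns without touching real data. The function name check_sql and the example schema are hypothetical, and this is not a substitute for database access control.

```python
import sqlite3

def check_sql(schema_ddl: str, sql: str):
    """Compile `sql` against an empty in-memory copy of the schema.

    Returns (ok, message). Only a single SELECT statement is accepted.
    """
    stripped = sql.strip().rstrip(";").strip()
    if not stripped.lower().startswith("select"):
        return False, "only SELECT statements are accepted"
    # Conservative: also rejects semicolons inside string literals.
    if ";" in stripped:
        return False, "multiple statements are not accepted"
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema_ddl)
        # EXPLAIN makes SQLite prepare the statement, which resolves
        # table and column names, without running the query itself.
        conn.execute("EXPLAIN " + stripped)
    except (sqlite3.Error, sqlite3.Warning) as exc:
        return False, str(exc)
    finally:
        conn.close()
    return True, "ok"

schema = "CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);"
print(check_sql(schema, "SELECT name FROM singer WHERE age > 25;"))
print(check_sql(schema, "SELECT nme FROM singer"))
print(check_sql(schema, "DROP TABLE singer"))
```

A failed check could be used to trigger a retry or to discard the output. A passing check only shows that the statement compiles against the schema, not that it answers the question.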