bsq1989/qwen_4b_sql
bsq1989/qwen_4b_sql is a 4-billion-parameter model fine-tuned from Qwen3-4B-Base for text-to-SQL generation. It converts a natural language question and a database schema into a SQL query, and was trained with full SFT on a cleaned split of the PipableAI/pip-txt-to-sql-spider-bird-dataset. It reaches higher execution accuracy on the Spider benchmark than its 1.7B-parameter counterpart, which makes it suitable for schema-conditioned SQL generation experiments.
Model Overview
bsq1989/qwen_4b_sql is a 4 billion parameter model based on Qwen/Qwen3-4B-Base, specifically fine-tuned for text-to-SQL generation. It was trained using LLaMA-Factory with full SFT on a cleaned version of the PipableAI/pip-txt-to-sql-spider-bird-dataset.
Key Capabilities & Performance
- Text-to-SQL Generation: Converts natural language questions and provided database schemas into SQL queries.
- Spider Benchmark: Achieves a 67.6% Test Suite execution accuracy and 35.0% exact match on the Spider dev set.
- Performance Comparison: Outperforms the Qwen3-1.7B-Base model in execution accuracy on the Spider benchmark, indicating a stronger capability for generating executable SQL.
- Training Details: Trained on a single NVIDIA H20 96GB GPU with bf16 precision and a context length of 2048.
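The gap between the execution accuracy and exact match figures above is easier to read with a small illustration. The sketch below is not the Spider evaluation code: it uses a hypothetical in-memory SQLite table, compares result sets for a single database, and approximates exact match with a naive string comparison (the official metric compares parsed query components, and Test Suite accuracy checks several database instances). It only shows why a query can be counted as correct by execution while failing a textual match.

```python
import sqlite3
from collections import Counter


def execution_match(conn, predicted_sql, gold_sql):
    """Return True if both queries yield the same multiset of rows."""
    try:
        pred_rows = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        return False  # a prediction that fails to execute counts as a miss
    gold_rows = conn.execute(gold_sql).fetchall()
    return Counter(pred_rows) == Counter(gold_rows)


def naive_exact_match(predicted_sql, gold_sql):
    """Simplified stand-in for exact match: compare normalized strings."""
    def normalize(sql):
        return " ".join(sql.strip().rstrip(";").lower().split())
    return normalize(predicted_sql) == normalize(gold_sql)


# Hypothetical toy database, not taken from Spider.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE singer (singer_id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
    INSERT INTO singer VALUES (1, 'Ana', 31), (2, 'Ben', 45), (3, 'Cho', 27);
""")

gold = "SELECT name FROM singer WHERE age > 30"
pred = "SELECT T1.name FROM singer AS T1 WHERE T1.age >= 31"

same_rows = execution_match(conn, pred, gold)   # same rows on this data
same_text = naive_exact_match(pred, gold)       # different query text
```

Here the two queries return the same rows on this data but differ as text, which is the kind of case that separates the two metrics.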
Intended Use Cases
- Text-to-SQL Research Baselines: Ideal for establishing baseline performance in text-to-SQL research.
- Schema-Conditioned SQL Generation: Suitable for experiments requiring SQL generation based on explicit schema definitions.
- Single-Turn SQL Generation: Designed for generating SQL from a single natural language query paired with schema text.
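A single-turn, schema-conditioned request of the kind described above might look like the following sketch. The prompt template and the helper names `build_prompt` and `generate_sql` are assumptions made for illustration; this card does not document the exact format used during fine-tuning, so the template may need to be adapted. The loading and generation calls are the standard Hugging Face `transformers` ones.

```python
MODEL_ID = "bsq1989/qwen_4b_sql"


def build_prompt(schema: str, question: str) -> str:
    """Assemble a single-turn prompt from schema text and one question.

    The section headers below are an assumed layout, not the documented
    training format.
    """
    return (
        "### Database schema\n"
        f"{schema.strip()}\n\n"
        "### Question\n"
        f"{question.strip()}\n\n"
        "### SQL\n"
    )


def generate_sql(schema: str, question: str, max_new_tokens: int = 256) -> str:
    """Load the model and greedily decode one SQL query."""
    # Imported lazily so build_prompt can be used without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)

    inputs = tokenizer(build_prompt(schema, question), return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)

    # Keep only the newly generated tokens, not the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()


# Hypothetical schema and question for illustration.
schema = "CREATE TABLE singer (singer_id INTEGER PRIMARY KEY, name TEXT, age INTEGER);"
question = "Which singers are older than 30?"
prompt = build_prompt(schema, question)
```

Calling `generate_sql(schema, question)` downloads the weights on first use. Schema text plus question should stay within the 2048-token context length used in training.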
Limitations
- Performance may decrease on more diverse or open-ended SQL benchmarks beyond Spider.
- Can occasionally produce invalid column references when encountering out-of-distribution schemas.
- Not validated for production-grade database access control or complex multi-turn agent workflows without additional tooling.
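One lightweight guard against the invalid column references mentioned above is to compile each generated query against an empty copy of the schema before running it on real data. The sketch below is one possible approach, not part of this model's tooling: it assumes the schema is available as SQLite-compatible DDL, and the helper name `check_sql_against_schema` is hypothetical. It relies on SQLite's `EXPLAIN`, which compiles a statement without executing it.

```python
import sqlite3


def check_sql_against_schema(schema_ddl, sql):
    """Return (ok, error_message) after compiling sql against an empty database."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema_ddl)
        # EXPLAIN compiles the statement without running it, so unknown
        # tables or columns surface as errors at this point.
        conn.execute("EXPLAIN " + sql)
        return True, None
    except sqlite3.Error as exc:
        return False, str(exc)
    finally:
        conn.close()


# Hypothetical schema for illustration.
schema_ddl = "CREATE TABLE singer (singer_id INTEGER PRIMARY KEY, name TEXT, age INTEGER);"

valid_ok, valid_err = check_sql_against_schema(schema_ddl, "SELECT name FROM singer WHERE age > 30")
bad_ok, bad_err = check_sql_against_schema(schema_ddl, "SELECT nme FROM singer")
```

A check like this catches schema mismatches only. It says nothing about whether the query answers the question, and it is not a substitute for database access control.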