XGenerationLab/XiYanSQL-QwenCoder-3B-2504

Warm
Public
3.1B
BF16
32768
Apr 28, 2025
License: apache-2.0
Hugging Face
Overview

Overview

The XiYanSQL-QwenCoder-3B-2504 is a 3.1 billion parameter model from XGenerationLab, specifically designed for Text-to-SQL generation. It leverages a combination of fine-tuning and GRPO (Gradient-based Reinforcement Learning with Policy Optimization) training strategies to enhance both efficiency and accuracy in SQL query generation. The model supports multiple SQL dialects, including SQLite, PostgreSQL, and MySQL, and is noted for its improved generalization capabilities across different dialects and out-of-domain datasets.

Key Capabilities

  • Multi-dialect SQL Generation: Capable of generating SQL for SQLite, PostgreSQL, and MySQL.
  • Enhanced Performance: Utilizes fine-tuning and GRPO training for optimized SQL generation without a complex thinking process.
  • Strong Generalization: Excels on various dialects and out-of-domain datasets, as validated by internal benchmarks.
  • Real-world Benchmarking: Evaluated against a real-world SQL benchmark (DW test set) comprising thousands of complex queries from PostgreSQL and MySQL scenarios.

Performance Highlights

The 3B model demonstrates competitive performance on benchmarks like BIRD Dev, Spider Test, and DW datasets. For instance, the XiYanSQL-QwenCoder-3B-2504 achieves 55.08% on BIRD Dev@M-Schema and 84.10% on Spider Test@M-Schema, showcasing its robust capabilities in SQL generation compared to previous versions and other models in its size class.

Good for

  • Developers needing to convert natural language questions into SQL queries for SQLite, PostgreSQL, or MySQL databases.
  • Applications requiring efficient and accurate SQL generation with good generalization across diverse database schemas and data types.
  • Use cases where a smaller, specialized model for SQL generation is preferred over larger, general-purpose LLMs.