XiYanSQL-QwenCoder-7B-2502: Specialized Text-to-SQL Model
XiYanSQL-QwenCoder-7B-2502 is part of the XiYanSQL-QwenCoder series from XGenerationLab, a family of large language models built for text-to-SQL generation. The 7.6-billion-parameter model is optimized to convert natural language questions into SQL queries across several database dialects.
Key Capabilities
- High Performance: The series has reached state-of-the-art results on the BIRD test set, where the 32B variant achieved a 69.03% execution accuracy (EX) score. The 7B model is competitive on both the BIRD and Spider benchmarks.
- Multi-Dialect Support: It supports mainstream SQL dialects such as SQLite, PostgreSQL, and MySQL, offering flexibility for diverse database environments.
- Flexible Usage: It can be applied directly to text-to-SQL tasks or used as a foundation for further fine-tuning toward specific SQL generation needs.
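As an illustration of the multi-dialect point, the sketch below assembles a dialect-aware prompt from a question and a schema given as DDL. The template wording, the `build_prompt` helper, and the `SUPPORTED_DIALECTS` set are assumptions made for this example, not the model's official prompt format, which should be taken from the model card.

```python
# Dialects named in this summary; the set itself is an illustrative assumption.
SUPPORTED_DIALECTS = {"SQLite", "PostgreSQL", "MySQL"}


def build_prompt(question: str, schema: str, dialect: str = "SQLite") -> str:
    """Assemble a text-to-SQL prompt (hypothetical template, not the official one)."""
    if dialect not in SUPPORTED_DIALECTS:
        raise ValueError(f"Unsupported dialect: {dialect}")
    return (
        f"You are a {dialect} expert. Read the database schema and the question, "
        f"then write a single {dialect} query that answers the question.\n\n"
        f"[Schema]\n{schema.strip()}\n\n"
        f"[Question]\n{question.strip()}\n\n"
        f"[SQL]\n"
    )


# Example schema supplied as plain DDL.
example_schema = """
CREATE TABLE singer (
    id INTEGER PRIMARY KEY,
    name TEXT,
    age INTEGER
);
"""

prompt = build_prompt("How many singers are older than 30?", example_schema, "PostgreSQL")
print(prompt)
```

The resulting string would then be passed to whichever inference stack hosts the model. Keeping the dialect as an explicit argument makes it easy to reuse the same question and schema against different database backends.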
Performance Highlights
On the BIRD Dev@M-Schema benchmark, XiYanSQL-QwenCoder-7B scored 59.78%, and on Spider Test@M-Schema it scored 84.86%. Both figures are competitive with other models in its size class. The model is designed to work with both M-Schema and DDL schema formats.
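For readers unfamiliar with execution-based scoring, the sketch below shows one simplified way such a score can be computed: each predicted query and its reference query are run against the same database, and a prediction counts as correct when both return the same set of rows. This is an assumption-laden approximation using SQLite, not the official BIRD or Spider evaluation script; the helper names `execution_match` and `execution_accuracy` are hypothetical.

```python
import sqlite3


def execution_match(conn: sqlite3.Connection, predicted_sql: str, gold_sql: str) -> bool:
    """Return True if both queries yield the same set of rows (simplified comparison)."""
    try:
        predicted_rows = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        # A prediction that fails to execute counts as incorrect.
        return False
    gold_rows = conn.execute(gold_sql).fetchall()
    return set(predicted_rows) == set(gold_rows)


def execution_accuracy(conn: sqlite3.Connection, pairs: list[tuple[str, str]]) -> float:
    """Percentage of (predicted, gold) pairs whose results match."""
    if not pairs:
        return 0.0
    hits = sum(execution_match(conn, pred, gold) for pred, gold in pairs)
    return 100.0 * hits / len(pairs)


# Tiny in-memory database for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany("INSERT INTO singer VALUES (?, ?, ?)", [(1, "Ann", 30), (2, "Bo", 41)])

pairs = [
    # Different SQL text, same result rows: counts as a match.
    ("SELECT name FROM singer WHERE age > 35", "SELECT name FROM singer WHERE age >= 40"),
    # Prediction returns an extra row: counts as a miss.
    ("SELECT name FROM singer", "SELECT name FROM singer WHERE age > 35"),
]
print(execution_accuracy(conn, pairs))  # 50.0
```

Comparing result sets rather than SQL text means two differently written queries can both be counted as correct, which is why execution-based scores are commonly reported for text-to-SQL models.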
Good For
- Developers requiring accurate and efficient SQL generation from natural language.
- Projects involving database interaction where automated query creation is beneficial.
- Teams needing a base model for fine-tuning on custom text-to-SQL datasets.