SLM-SQL: Small Language Models for Text-to-SQL
SLM-SQL-Base-0.6B is part of the SLM-SQL project by cycloneboy, which focuses on enhancing small language models (SLMs) for Text-to-SQL tasks. Traditional SLMs (0.5B-1.5B parameters) often struggle with the logical reasoning required for SQL generation, but this model demonstrates that targeted post-training can deliver strong performance. This particular model has 0.6 billion parameters, is built on the Qwen3-0.6B base, and supports a context length of 32,768 tokens.
Key Capabilities
- Optimized Text-to-SQL: Specifically fine-tuned to translate natural language questions into SQL queries.
- Enhanced Reasoning: Utilizes supervised fine-tuning (SFT) on the SynSQL-Think-916K dataset and a corrective self-consistency approach to improve SQL generation accuracy.
- Efficient Deployment: Designed for scenarios requiring faster inference speeds and suitability for edge device deployment, leveraging the inherent advantages of SLMs.
- Competitive Performance: The SLM-SQL method yields significant improvements on the BIRD development set across model sizes in the family, with the 0.5B model reaching 56.87% execution accuracy (EX) and the 1.5B model achieving 67.08% EX, validating the effectiveness of the approach.
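The corrective self-consistency idea mentioned above can be illustrated with a simplified, execution-based vote: sample several candidate SQL queries, run each against the database, discard those that fail to execute, and pick a query from the largest group that agrees on the same result. This is a minimal sketch of the general technique, not the project's exact implementation (the `self_consistency_vote` helper and its voting rule are assumptions):

```python
import sqlite3


def self_consistency_vote(conn, candidate_sqls):
    """Execution-based self-consistency over candidate SQL queries.

    Runs each candidate, groups candidates by their result set, and
    returns one query from the largest group. Candidates that raise an
    execution error are dropped -- a simplified stand-in for the
    corrective step; the actual SLM-SQL pipeline may differ.
    """
    buckets = {}  # result set -> list of candidate queries producing it
    for sql in candidate_sqls:
        try:
            rows = frozenset(conn.execute(sql).fetchall())
        except sqlite3.Error:
            continue  # non-executable candidate: discard
        buckets.setdefault(rows, []).append(sql)
    if not buckets:
        return None  # no candidate executed successfully
    # Majority vote: pick the group with the most agreeing candidates.
    return max(buckets.values(), key=len)[0]
```

For example, if three sampled candidates agree on the same count and a fourth is malformed, the vote returns one of the agreeing queries.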
Good for
- Developing Text-to-SQL applications where inference speed and resource efficiency are critical.
- Deploying SQL generation capabilities on edge devices or environments with limited computational resources.
- Researchers and developers exploring advanced post-training techniques for small language models in specialized domains.
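For Text-to-SQL applications like those above, inference typically involves serializing the database schema and the user's question into a prompt. The template below is an assumption for illustration, not the official SLM-SQL prompt format, and the Hugging Face repo id in the comment is hypothetical; consult the project's repository for the template used during training:

```python
def build_text2sql_prompt(schema: str, question: str) -> str:
    """Format a Text-to-SQL prompt from a schema and a question.

    This template is an illustrative assumption; the SLM-SQL models
    were trained with their own prompt format, which should be
    matched at inference time.
    """
    return (
        "You are a Text-to-SQL assistant.\n"
        f"Database schema:\n{schema}\n"
        f"Question: {question}\n"
        "Write a single SQL query that answers the question."
    )


# The resulting string could then be fed to the model with Hugging Face
# transformers (not run here; repo id is hypothetical):
#   tok = AutoTokenizer.from_pretrained("cycloneboy/SLM-SQL-Base-0.6B")
#   model = AutoModelForCausalLM.from_pretrained("cycloneboy/SLM-SQL-Base-0.6B")
#   out = model.generate(**tok(prompt, return_tensors="pt"))
```

Keeping the prompt short and schema-focused matters for SLMs, where the 32,768-token context is generous but reasoning capacity is the bottleneck.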