SLM-SQL-0.6B: Small Language Model for Text-to-SQL
SLM-SQL-0.6B is a 0.6-billion-parameter model developed by cycloneboy, engineered specifically for Text-to-SQL tasks. While larger language models (LLMs) perform well in this domain, small language models (SLMs) typically struggle with the logical reasoning required for accurate SQL generation. This model addresses that gap through supervised fine-tuning (SFT) followed by reinforcement learning (GRPO) on the specialized SynSQL-Think-916K and SynSQL-Merge-Think-310K datasets.
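As a rough illustration of how such a model is typically queried, one might pair the database schema with the question in a single prompt. Note that the prompt template below and the Hugging Face repo id are assumptions for illustration, not taken from this card; check the official model card for the canonical format.

```python
# Hedged sketch: the exact prompt template SLM-SQL expects is an assumption.
def build_prompt(schema: str, question: str) -> str:
    """Combine a database schema and a natural-language question
    into a single Text-to-SQL prompt."""
    return (
        "Given the following database schema:\n"
        f"{schema}\n\n"
        f"Question: {question}\n"
        "Write a single SQL query that answers the question.\n"
        "SQL:"
    )

prompt = build_prompt(
    "CREATE TABLE employees (id INT, name TEXT, salary REAL);",
    "What is the average salary?",
)

# Actual inference requires `transformers` and downloading the weights;
# the repo id "cycloneboy/SLM-SQL-0.6B" is assumed here:
# from transformers import AutoModelForCausalLM, AutoTokenizer
# tok = AutoTokenizer.from_pretrained("cycloneboy/SLM-SQL-0.6B")
# model = AutoModelForCausalLM.from_pretrained("cycloneboy/SLM-SQL-0.6B")
# out = model.generate(**tok(prompt, return_tensors="pt"), max_new_tokens=128)
# print(tok.decode(out[0], skip_special_tokens=True))
```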
Key Capabilities
- Enhanced Text-to-SQL Performance: Achieves significant improvements in execution accuracy (EX) on benchmarks like BIRD, with the 0.5B variant reaching 56.87% EX and the 1.5B variant achieving 67.08% EX.
- Corrective Self-Consistency: Applies an inference strategy that revises candidate SQL queries and selects among them, improving execution accuracy.
- Optimized for Small Models: Demonstrates that SLMs can achieve competitive Text-to-SQL performance while offering faster inference and suitability for deployment in resource-constrained environments.
- Leverages Qwen3-0.6B Base: Built upon the Qwen3-0.6B architecture, further specialized for SQL generation.
Good for
- Applications requiring efficient natural language to SQL conversion.
- Edge deployment scenarios where inference speed and model size are critical.
- Developers looking for a specialized, high-performing Text-to-SQL solution within the small language model category.