SLM-SQL: Small Language Models for Text-to-SQL
SLM-SQL-Base-1.5B is part of the SLM-SQL project by cycloneboy, focusing on enhancing the Text-to-SQL capabilities of Small Language Models (SLMs). This 1.5 billion parameter model, built upon Qwen2.5-Coder-1.5B-Instruct, addresses the challenge of SLMs underperforming in Text-to-SQL due to limited logical reasoning, while capitalizing on their advantages in inference speed and edge deployment.
Key Capabilities
- Specialized Text-to-SQL Translation: Fine-tuned specifically for converting natural language questions into SQL queries.
- Enhanced Performance for SLMs: Achieves significantly improved execution accuracy (EX) on Text-to-SQL benchmarks compared to general-purpose SLMs.
- Efficient Inference: Designed for faster inference and suitability for deployment in resource-constrained environments.
- Supervised Fine-Tuning (SFT): Utilizes the SynSQL-2.5M dataset, including SynSQL-Think-916K for SQL generation and SynSQL-Merge-Think-310K for SQL merge revision.
Good for
- Text-to-SQL Applications: Ideal for scenarios requiring accurate translation of natural language to SQL.
- Edge Device Deployment: Suitable for applications where computational resources are limited, benefiting from the model's smaller size.
- Rapid SQL Generation: Provides a faster alternative to larger LLMs for generating SQL queries from text.
- Research in SLM Capabilities: Useful for exploring and advancing the performance of small language models in complex reasoning tasks like Text-to-SQL.