SLM-SQL: Small Language Models for Text-to-SQL

This model, CscSQL-Merge-Qwen2.5-Coder-0.5B-Instruct, is part of the SLM-SQL project by Lei Sheng and Shuai-Shuai Xu, which focuses on improving small language models (SLMs) for Text-to-SQL tasks. SLMs in the 0.5B-1.5B parameter range often struggle with the logical reasoning that SQL generation requires; the project reports significant improvements from post-training techniques.

Key Capabilities and Features

Text-to-SQL Translation: Optimized to convert natural language questions into accurate SQL queries.
Base Model: Built on Qwen2.5-Coder-0.5B-Instruct, which provides a strong foundation for code-related tasks.
Advanced Training: Uses supervised fine-tuning (SFT) and reinforcement learning with Group Relative Policy Optimization (GRPO) on specialized datasets such as SynSQL-Think-916K and SynSQL-Merge-Think-310K.
Corrective Self-Consistency: Uses an inference method that improves SQL generation accuracy by iteratively refining queries.
Performance: The 0.5B model achieved 56.87% execution accuracy (EX) on the BIRD development set, demonstrating substantial improvement over baseline SLMs.
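
Execution accuracy (EX) counts a prediction as correct when running it returns the same result as running the reference query. The sketch below illustrates that idea with Python's built-in sqlite3 module. It is a simplified illustration, not the official BIRD evaluator; the `singer` table, the query pairs, and the helper names `run_query` and `execution_accuracy` are hypothetical.

```python
import sqlite3


def run_query(conn, sql):
    """Run sql and return its rows in a canonical order, or None if it fails."""
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None
    # Sort by repr so that row order does not affect the comparison.
    return sorted(rows, key=repr)


def execution_accuracy(conn, pairs):
    """Fraction of (predicted, gold) pairs whose execution results match."""
    hits = 0
    for predicted, gold in pairs:
        gold_rows = run_query(conn, gold)
        pred_rows = run_query(conn, predicted)
        if gold_rows is not None and pred_rows == gold_rows:
            hits += 1
    return hits / len(pairs) if pairs else 0.0


# Hypothetical toy database and query pairs, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
INSERT INTO singer VALUES (1, 'Ana', 31), (2, 'Ben', 45), (3, 'Cy', 27);
""")
pairs = [
    # Different SQL text, same result: counts as correct.
    ("SELECT name FROM singer WHERE age > 30",
     "SELECT name FROM singer WHERE age >= 31"),
    # Runs, but returns an extra row: counts as wrong.
    ("SELECT name FROM singer",
     "SELECT name FROM singer WHERE age > 30"),
    # Fails to execute (misspelled column): counts as wrong.
    ("SELECT nme FROM singer",
     "SELECT name FROM singer"),
]
score = execution_accuracy(conn, pairs)
print(f"EX = {score:.2%}")
```

Because the comparison is on results rather than on SQL text, two differently written queries can both count as correct.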

Why This Model is Different

Larger LLMs perform well on Text-to-SQL but demand significant computational resources. SLM-SQL instead aims for competitive performance with a much smaller footprint. This model addresses the limitations of SLMs in logical reasoning for SQL by applying targeted fine-tuning and a corrective self-consistency approach, which makes it suitable for scenarios that need fast inference or deployment on edge devices.
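
As background for the self-consistency idea mentioned above, the sketch below shows plain self-consistency voting: several candidate queries are executed, grouped by their results, and a query from the largest group is kept. This is only the voting part, under the assumption that candidates are compared by execution result; the corrective refinement step used by the project is not reproduced here. The table, the candidate queries, and the helper names `result_key` and `vote` are hypothetical.

```python
import sqlite3
from collections import Counter


def result_key(conn, sql):
    """Execute sql and return a hashable, order-insensitive view of its rows."""
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None  # Candidates that fail to run get no vote.
    return tuple(sorted(rows, key=repr))


def vote(conn, candidates):
    """Return the first candidate whose execution result is the most common."""
    keys = [result_key(conn, sql) for sql in candidates]
    counts = Counter(key for key in keys if key is not None)
    if not counts:
        return None  # No candidate executed successfully.
    best_key, _ = counts.most_common(1)[0]
    return candidates[keys.index(best_key)]


# Hypothetical toy database and sampled candidates, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
INSERT INTO singer VALUES (1, 'Ana', 31), (2, 'Ben', 45), (3, 'Cy', 27);
""")
candidates = [
    "SELECT COUNT(*) FROM singer WHERE age > 30",    # returns 2
    "SELECT COUNT(id) FROM singer WHERE age >= 31",  # returns 2
    "SELECT COUNT(*) FROM singer",                   # returns 3
    "SELECT COUNT(*) FROM singers",                  # fails: no such table
]
chosen = vote(conn, candidates)
print(chosen)
```

Voting on execution results rather than on query text lets syntactically different but equivalent candidates reinforce each other.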

Suitable Use Cases

Efficient Text-to-SQL: Applications where converting natural language to SQL needs to be fast and resource-light.
Edge Deployment: Scenarios where models need to run on devices with limited computational power.
Database Interaction: Building interfaces that allow users to query databases using natural language without relying on large, complex models.
Research in SLM Capabilities: Exploring the potential of small models in complex reasoning tasks.
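
For the database-interaction use case, an interface typically has to show the model the database schema alongside the user's question. The sketch below reads the table definitions from a SQLite database and assembles a prompt string. The prompt wording, the example table, and the helper names `schema_ddl` and `build_prompt` are assumptions made for illustration; the prompt format this model was actually trained with is not specified here, and the step of sending the prompt to the model is left out.

```python
import sqlite3


def schema_ddl(conn):
    """Return the CREATE TABLE statements stored in the SQLite catalog."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master "
        "WHERE type = 'table' AND sql IS NOT NULL ORDER BY name"
    ).fetchall()
    return "\n".join(row[0] + ";" for row in rows)


def build_prompt(conn, question):
    """Combine schema and question into one prompt (hypothetical wording)."""
    return (
        "Database schema:\n"
        f"{schema_ddl(conn)}\n\n"
        f"Question: {question}\n"
        "Return one SQLite query that answers the question."
    )


# Hypothetical toy database, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
CREATE TABLE concert (id INTEGER PRIMARY KEY, singer_id INTEGER, year INTEGER);
""")
prompt = build_prompt(conn, "How many singers are there?")
print(prompt)
```

Reading the schema from the database at request time keeps the prompt in sync with the tables the generated query will run against.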
