cycloneboy/CscSQL-Merge-Qwen2.5-Coder-0.5B-Instruct

Text Generation | Concurrency Cost: 1 | Model Size: 0.5B | Quant: BF16 | Ctx Length: 32k | Published: Jul 31, 2025 | License: cc-by-nc-4.0 | Architecture: Transformer | Open Weights | Warm

cycloneboy/CscSQL-Merge-Qwen2.5-Coder-0.5B-Instruct is a 0.5-billion-parameter small language model (SLM) built on Qwen2.5-Coder-0.5B-Instruct and developed by Lei Sheng and Shuai-Shuai Xu. It is fine-tuned for Text-to-SQL using supervised fine-tuning and reinforcement learning, together with a corrective self-consistency approach. The model translates natural language questions into SQL queries and reaches 56.87% execution accuracy on the BIRD development set, which makes it a candidate for Text-to-SQL applications where inference speed and edge deployment matter.


SLM-SQL: Small Language Models for Text-to-SQL

This model, CscSQL-Merge-Qwen2.5-Coder-0.5B-Instruct, is part of the SLM-SQL project by Lei Sheng and Shuai-Shuai Xu, which aims to improve small language models (SLMs) on Text-to-SQL tasks. SLMs in the 0.5B-1.5B parameter range often struggle with the logical reasoning that SQL generation requires; the project reports significant improvements from post-training techniques.

Key Capabilities and Features

  • Text-to-SQL Translation: Optimized to convert natural language questions into accurate SQL queries.
  • Base Model: Built on Qwen2.5-Coder-0.5B-Instruct, which provides a strong foundation for code-related tasks.
  • Advanced Training: Uses supervised fine-tuning (SFT) and reinforcement learning (GRPO) on specialized datasets such as SynSQL-Think-916K and SynSQL-Merge-Think-310K.
  • Corrective Self-Consistency: Employs an inference method that enhances SQL generation accuracy by iteratively refining queries.
  • Performance: The 0.5B model achieved 56.87% execution accuracy (EX) on the BIRD development set, which the project presents as a substantial improvement over baseline SLMs.
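
To make the execution accuracy (EX) figure concrete, here is a minimal sketch of how such a metric can be computed: a prediction counts as correct when executing it returns the same rows as the reference query. The helper names (`run_query`, `execution_accuracy`, `demo_connection`) and the toy `singer` table are hypothetical, and comparing results as unordered sets of rows is an assumption, not necessarily the exact procedure behind the number reported above.

```python
import sqlite3


def run_query(conn, sql):
    """Return the result rows as a frozenset, or None if the query fails."""
    try:
        return frozenset(conn.execute(sql).fetchall())
    except sqlite3.Error:
        return None


def execution_accuracy(conn, pairs):
    """Fraction of (predicted_sql, gold_sql) pairs whose results match.

    Assumption: results are compared as unordered sets of rows.
    """
    if not pairs:
        return 0.0
    correct = 0
    for predicted, gold in pairs:
        gold_rows = run_query(conn, gold)
        pred_rows = run_query(conn, predicted)
        # A prediction that fails to execute never counts as correct.
        if pred_rows is not None and pred_rows == gold_rows:
            correct += 1
    return correct / len(pairs)


def demo_connection():
    """Build a tiny in-memory database (hypothetical data) for the example."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
        INSERT INTO singer VALUES (1, 'Ana', 31), (2, 'Bo', 24), (3, 'Cy', 45);
        """
    )
    return conn
```

Two differently written queries score as a match as long as they return the same rows, which is why EX is more forgiving than exact string comparison of SQL text.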

Why This Model is Different

Larger LLMs perform well on Text-to-SQL but demand significant computational resources. SLM-SQL instead aims for competitive performance with a much smaller footprint. This model addresses the weakness of SLMs in logical reasoning for SQL through targeted fine-tuning and a corrective self-consistency approach, which suits scenarios that need faster inference or deployment on edge devices.
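
The voting stage that self-consistency methods rely on can be sketched as follows: several candidate queries are executed, candidates that return the same result are grouped, and the largest group is treated as the most likely answer. This is a simplified illustration under stated assumptions, not the project's implementation; the function names are hypothetical, and the corrective step that revises candidates is omitted because this page does not describe it in detail.

```python
import sqlite3
from collections import defaultdict


def execute(conn, sql):
    """Run a query and return a hashable, order-insensitive result key.

    Returns None when the query fails to execute.
    """
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None
    return tuple(sorted(rows, key=repr))


def vote_by_execution(conn, candidates):
    """Group candidate SQL strings by execution result.

    Returns a list of groups, largest first; ties keep first-seen order.
    Candidates that fail to execute are dropped.
    """
    groups = defaultdict(list)
    for sql in candidates:
        result = execute(conn, sql)
        if result is not None:
            groups[result].append(sql)
    return sorted(groups.values(), key=len, reverse=True)
```

Given candidates sampled from the model, `vote_by_execution(conn, candidates)[0][0]` would be the first query in the majority group. A corrective stage could then inspect the top groups rather than trusting the majority outright.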

Suitable Use Cases

  • Efficient Text-to-SQL: Applications where converting natural language to SQL needs to be fast and resource-light.
  • Edge Deployment: Scenarios where models need to run on devices with limited computational power.
  • Database Interaction: Building interfaces that allow users to query databases using natural language without relying on large, complex models.
  • Research in SLM Capabilities: Exploring the potential of small models in complex reasoning tasks.
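
As an illustration of the database interaction use case, the sketch below reads the schema from a SQLite database, builds a prompt, and runs whatever SQL a text generator returns. Everything here is an assumption for illustration: the prompt layout is hypothetical and is not the format the model was trained on, and `generate` stands in for any callable that sends a prompt to the model and returns its SQL text.

```python
import sqlite3


def schema_text(conn):
    """Collect the CREATE TABLE statements stored in the SQLite catalog."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND sql IS NOT NULL"
    ).fetchall()
    return "\n".join(row[0] for row in rows)


def build_prompt(question, schema):
    """Hypothetical prompt layout: schema first, then the question."""
    return (
        "Database schema:\n" + schema + "\n\n"
        "Question: " + question + "\n"
        "Answer with a single SQLite SELECT statement."
    )


def answer_question(conn, question, generate):
    """Ask `generate` for SQL, run it, and return (sql, rows).

    `generate` is any callable mapping a prompt string to SQL text.
    The SELECT check is a minimal guard, not a security boundary.
    """
    prompt = build_prompt(question, schema_text(conn))
    sql = generate(prompt).strip().rstrip(";")
    if not sql.lower().startswith("select"):
        raise ValueError("refusing to run non-SELECT statement: " + sql)
    return sql, conn.execute(sql).fetchall()
```

Passing the generator in as a callable keeps the database logic independent of the inference stack, so the same code works with a local model, a hosted endpoint, or a stub in tests. In production, a read-only database connection would be a safer safeguard than the string check shown here.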