Struct-SQL: Structured Reasoning for Text-to-SQL
Struct-SQL is a 4-billion-parameter Text-to-SQL model developed by craterlabs, built on Qwen3-4B-Instruct-2507. Its core contribution is a Knowledge Distillation (KD) framework that transfers structured reasoning, in the form of Query Execution Plans (QEPs), from a teacher LLM (GPT-4o) to this smaller student model. Unlike conventional distillation of unstructured Chain-of-Thought (CoT) traces, Struct-SQL learns to generate a formal, logical blueprint before producing the final SQL query.
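To make the plan-then-query idea concrete, here is a minimal sketch of how one distillation training example might be laid out. The exact QEP format used by Struct-SQL is not described here, so the class name `DistillationExample`, the `Plan:` / `SQL:` section labels, and the wording of the plan steps are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class DistillationExample:
    """One hypothetical training pair: prompt in, plan followed by SQL out."""

    question: str
    schema: str
    plan_steps: list[str]  # teacher-written plan; step wording is illustrative
    sql: str

    def prompt(self) -> str:
        # Input shown to the student model (layout is an assumption).
        return f"Schema:\n{self.schema}\n\nQuestion: {self.question}\n"

    def target(self) -> str:
        # The student is trained to emit the structured plan first, then the SQL.
        plan = "\n".join(
            f"{i}. {step}" for i, step in enumerate(self.plan_steps, start=1)
        )
        return f"Plan:\n{plan}\n\nSQL:\n{self.sql}"


example = DistillationExample(
    question="How many singers are older than 30?",
    schema="CREATE TABLE singer (id INTEGER, name TEXT, age INTEGER);",
    plan_steps=[
        "Scan table singer",
        "Filter rows where age > 30",
        "Aggregate with COUNT(*)",
    ],
    sql="SELECT COUNT(*) FROM singer WHERE age > 30;",
)
print(example.target())
```

The point of the layout is only that the plan precedes the SQL in the training target, so the query is conditioned on an explicit sequence of steps rather than on free-form reasoning text.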
Key Capabilities
- Structured Reasoning: Learns to generate explicit Query Execution Plans, improving logical coherence.
- Reduced Errors: Reduces syntactic errors and schema hallucinations in generated SQL.
- Enhanced Accuracy: Achieves an Execution Accuracy (EX) of 45.0% on the BIRD mini-dev benchmark, outperforming unstructured CoT baselines by 8.1 points.
- Efficient Distillation: Effectively transfers complex reasoning from larger models to a compact 4B parameter model.
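For readers unfamiliar with the Execution Accuracy metric cited above, the following is a simplified sketch of the idea: a predicted query counts as correct when it returns the same rows as the reference query on the same database. It is not the official BIRD evaluation script; the function names and the choice to compare results as unordered sets of rows are assumptions made for this illustration.

```python
import sqlite3


def execution_match(conn, predicted_sql, gold_sql):
    """Return True if the predicted query yields the same set of rows as the gold query."""
    try:
        predicted_rows = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        return False  # a prediction that fails to execute counts as wrong
    # The gold query is assumed to be valid; an error here propagates to the caller.
    gold_rows = conn.execute(gold_sql).fetchall()
    # Sets make the comparison insensitive to row order and duplicates.
    return set(predicted_rows) == set(gold_rows)


def execution_accuracy(conn, pairs):
    """Fraction of (predicted_sql, gold_sql) pairs whose results match."""
    if not pairs:
        return 0.0
    matches = sum(execution_match(conn, pred, gold) for pred, gold in pairs)
    return matches / len(pairs)


def build_demo_db():
    """Create a tiny in-memory database used only for this example."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE singer (id INTEGER, name TEXT, age INTEGER)")
    conn.executemany(
        "INSERT INTO singer VALUES (?, ?, ?)",
        [(1, "Ana", 25), (2, "Bo", 41), (3, "Cy", 33)],
    )
    return conn
```

Because only query results are compared, two syntactically different queries can both be counted as correct as long as they return the same rows.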
Good for
- Research and Academic Use: Ideal for studying knowledge distillation with structured intermediate representations.
- Text-to-SQL Generation: Suited to semantic parsing of natural-language questions over relational databases.
- Error Reduction Studies: Useful for investigating methods to improve SQL validity and schema grounding.
- Compact Reasoning Models: Demonstrates complex reasoning capabilities within a limited parameter budget.
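As a starting point for the kind of error analysis mentioned above, the sketch below sorts generated queries into rough categories by asking SQLite to compile them without running them. It assumes a SQLite database; the function name `classify_sql` and the category labels are hypothetical, and the classification relies on SQLite's error message text, which is a heuristic rather than a stable interface.

```python
import sqlite3


def classify_sql(conn, sql):
    """Roughly classify a generated query as ok, syntax_error, schema_error, or other_error."""
    try:
        # EXPLAIN makes SQLite compile the statement and return its bytecode
        # listing instead of executing the query against the data.
        conn.execute("EXPLAIN " + sql)
    except sqlite3.Error as exc:
        message = str(exc)
        # References to tables or columns missing from the schema
        # (one form of schema hallucination).
        if message.startswith(("no such table", "no such column")):
            return "schema_error"
        if "syntax error" in message:
            return "syntax_error"
        return "other_error"
    return "ok"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE singer (id INTEGER, name TEXT, age INTEGER)")

for query in [
    "SELECT name FROM singer",
    "SELECT nme FROM singer",
    "SELECT name FROM artists",
    "SELEC name FROM singer",
]:
    print(classify_sql(conn, query), "|", query)
```

A query labeled `ok` here is only known to compile against the schema; whether it answers the question correctly still has to be checked by comparing execution results.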
This model is primarily intended for research and academic exploration, not for direct production deployment without further validation.