thanhdath/FINER-SQL-0.5B-Spider

Text Generation
Concurrency Cost: 1
Model Size: 0.5B
Quant: BF16
Ctx Length: 32k
Published: Apr 29, 2026
License: apache-2.0
Architecture: Transformer
Open Weights
Cold

thanhdath/FINER-SQL-0.5B-Spider is a 0.5 billion parameter Text-to-SQL model, fine-tuned from griffith-bigdata/Qwen-2.5-Coder-0.5B-SQL-Writer with GRPO and FINER-SQL dense rewards. It reaches 75.0% execution accuracy on the Spider Dev dataset (n=30, value-aware voting). It is optimized for generating SQL queries from natural language questions, which makes it suitable for database interaction tasks.

FINER-SQL-0.5B-Spider: A Specialized Text-to-SQL Model

This model, developed by thanhdath, is a 0.5 billion parameter Text-to-SQL model. It is fine-tuned from griffith-bigdata/Qwen-2.5-Coder-0.5B-SQL-Writer using the GRPO algorithm, with FINER-SQL dense rewards (Memory + Atomic) as the training signal for SQL generation.
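GRPO scores each sampled completion relative to the other completions drawn for the same prompt. The sketch below illustrates only that group-relative normalization step. The reward values are placeholders, since how the Memory and Atomic rewards are computed or combined is not described here, and the function name, the `eps` term, and the use of the population standard deviation are assumptions (implementations differ on these details).

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-6):
    """Normalize the rewards of one prompt's sampled group to zero mean.

    Hypothetical helper: each reward is compared with the group mean and
    scaled by the group's standard deviation; eps avoids division by zero
    when every candidate receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Placeholder rewards for four SQL candidates sampled for one question.
rewards = [1.0, 0.0, 0.5, 0.5]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Candidates that score above the group mean get a positive advantage and are reinforced; those below the mean are discouraged. A dense reward matters here because it separates candidates that a pass/fail signal would score identically.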

Key Capabilities

  • High Accuracy for its Size: Achieves 75.0% Execution Accuracy on the Spider Dev dataset, measured with n=30 sampled candidates per question and value-aware voting.
  • Efficient Resource Usage: Designed to run efficiently on GPUs with 4-8 GB of memory.
  • Specialized Fine-tuning: Outperforms the 0.5B BIRD model by 6.4 percentage points on Spider Dev, which indicates that dataset-specific specialization improves performance.
  • Robust Inference Pipeline: For best results, generate 30 candidates at temperature 1.0, execute each one against the database, and select the final query with value-aware voting (vav).
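The execute-and-vote step of that pipeline can be sketched as follows. The exact definition of value-aware voting is not given here, so this sketch assumes a simple reading: candidates are grouped by the values they return when executed, and a query from the largest group wins. The helper names `result_key` and `vote` are hypothetical, and the candidate list stands in for the 30 samples the model would generate.

```python
import sqlite3
from collections import Counter

def result_key(conn, sql):
    """Execute one candidate; return a hashable, order-insensitive key, or None on failure."""
    try:
        rows = conn.execute(sql).fetchall()
    except (sqlite3.Error, sqlite3.Warning):
        return None  # invalid SQL takes no part in the vote
    return tuple(sorted(rows, key=repr))

def vote(conn, candidates):
    """Return the first candidate whose result values are the most common (assumed voting rule)."""
    keyed = [(sql, result_key(conn, sql)) for sql in candidates]
    valid = [(sql, key) for sql, key in keyed if key is not None]
    if not valid:
        return None
    best_key, _ = Counter(key for _, key in valid).most_common(1)[0]
    return next(sql for sql, key in valid if key == best_key)

# Toy database; in practice, open the target SQLite file read-only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE singer (name TEXT, age INTEGER)")
conn.executemany("INSERT INTO singer VALUES (?, ?)",
                 [("Ann", 30), ("Bob", 41), ("Cy", 25)])

# Stand-ins for sampled model outputs (the pipeline above uses 30 at temperature 1.0).
candidates = [
    "SELECT count(*) FROM singer WHERE age > 28",
    "SELECT count(*) FROM singer WHERE age >= 28",
    "SELECT count(*) FROM singer",
    "SELECT count(*) FROM singers",  # fails: no such table
]
chosen = vote(conn, candidates)
print(chosen)
```

Grouping by returned values rather than by SQL text lets differently written but equivalent queries pool their votes, and candidates that fail to execute are dropped before counting.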

Good For

  • Text-to-SQL Applications: Ideal for converting natural language questions into executable SQL queries, particularly for SQLite databases.
  • Resource-Constrained Environments: Suitable for deployment where computational resources are limited, thanks to its small parameter count.
  • Spider Dataset Tasks: Specifically optimized for tasks related to the Spider dataset, showing strong performance across various query hardness levels.
  • Developers Needing SQL Generation: Automates the creation of SQL queries from user input.