Overview
Contextual-SQL Reward Model
ContextualAI/ctx-bird-reward-250121 is a 32.8-billion-parameter reward model developed by Contextual AI, fine-tuned from Qwen-2.5-32B-Instruct. It serves as the scoring component of the Contextual-SQL system, which achieved the #1 position on the BIRD benchmark leaderboard in February 2025. Given a database schema and a natural language question, the model scores candidate SQL queries so that they can be ranked by how likely each is to execute correctly.
Key Capabilities
- SQL Candidate Scoring: Assigns each generated SQL query a score reflecting how likely it is to be correct, so that candidates can be ranked.
- Enhanced Text-to-SQL Accuracy: Integrates into a multi-stage pipeline to select the best SQL candidate from multiple generations.
- Fine-tuned for the BIRD Benchmark: Trained on the BIRD dataset using a classification objective with hard negative mining, so that it learns to identify incorrect SQL candidates.
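
As a rough illustration of where correct/incorrect labels for such a classification objective could come from, the sketch below labels candidates by comparing their execution results with a reference query on a SQLite database. The source does not describe the actual labeling or mining procedure, so this is an assumption: the helpers `run_query` and `label_candidates`, the toy `singer` table, and the reading of a "hard negative" as a candidate that executes but returns the wrong rows are all hypothetical.

```python
import sqlite3


def run_query(conn, sql):
    """Return result rows in a canonical order, or None if the query fails."""
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None
    # Sort by repr so row order does not affect the comparison.
    return sorted(rows, key=repr)


def label_candidates(conn, gold_sql, candidates):
    """Return (sql, label, executed) triples; label is 1 if results match the gold query."""
    gold = run_query(conn, gold_sql)
    labeled = []
    for sql in candidates:
        result = run_query(conn, sql)
        executed = result is not None
        label = 1 if executed and result == gold else 0
        labeled.append((sql, label, executed))
    return labeled


# Toy database (hypothetical, for illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
    INSERT INTO singer VALUES (1, 'Ann', 30), (2, 'Bob', 45), (3, 'Cy', 45);
    """
)

gold_sql = "SELECT name FROM singer WHERE age > 40"
candidates = [
    "SELECT name FROM singer WHERE age >= 41",  # same rows as the gold query
    "SELECT name FROM singer WHERE age >= 30",  # executes, but returns an extra row
    "SELECT nme FROM singer",                   # fails: no such column
]

labeled = label_candidates(conn, gold_sql, candidates)

# Assumed notion of a hard negative: runs without error yet gives the wrong result.
hard_negatives = [sql for sql, label, executed in labeled if executed and label == 0]
```

In this toy setup only the second candidate counts as a hard negative; the third is an easy negative because it fails to execute at all.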
Good For
- Improving Text-to-SQL Systems: Suited to developers building or enhancing text-to-SQL solutions, particularly those aiming for high accuracy on complex datasets.
- Integrating into Multi-Stage Pipelines: Designed to be used as part of a larger system that generates and then validates SQL queries.
- Benchmarking and Research: Useful for researchers and practitioners working on text-to-SQL tasks, especially those interested in reward model approaches.
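
The selection step in a generate-then-rank pipeline can be sketched as follows. This is a minimal sketch under stated assumptions, not the Contextual-SQL implementation: `select_best` and the `score_fn` signature are hypothetical, and `stub_score` is a stand-in with hard-coded values where a real system would call the reward model.

```python
from typing import Callable, Sequence


def select_best(
    question: str,
    schema: str,
    candidates: Sequence[str],
    score_fn: Callable[[str, str, str], float],
) -> str:
    """Return the candidate SQL query with the highest reward score."""
    if not candidates:
        raise ValueError("at least one candidate is required")
    scored = [(score_fn(question, schema, sql), sql) for sql in candidates]
    # max() keeps the first candidate when scores tie.
    return max(scored, key=lambda pair: pair[0])[1]


# Stand-in scorer with made-up scores (assumption, not real model output).
_FAKE_SCORES = {
    "SELECT COUNT(*) FROM singer": 0.91,
    "SELECT COUNT(name) FROM song": 0.12,
    "SELECT * FROM singer": 0.40,
}


def stub_score(question: str, schema: str, sql: str) -> float:
    return _FAKE_SCORES.get(sql, 0.0)


best = select_best(
    question="How many singers are there?",
    schema="CREATE TABLE singer (id INTEGER, name TEXT);",
    candidates=list(_FAKE_SCORES),
    score_fn=stub_score,
)
```

Passing the scorer in as a function keeps the selection logic independent of how the reward model is actually hosted or invoked.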
This model is not a standalone text-to-SQL generator: it does not produce SQL itself, but scores candidates produced by another model so that the most accurate one can be selected. For the complete Contextual-SQL pipeline, including SQL generation, refer to the Contextual-SQL GitHub repository.