Snowflake/Llama-3.1-Arctic-ExCoT-70B
Snowflake's Llama-3.1-Arctic-ExCoT-70B is a 70-billion-parameter model in the Arctic Text2SQL family, fine-tuned specifically for Text-to-SQL tasks. It is trained with the ExCoT (Execution-Guided Chain-of-Thought) framework, which combines chain-of-thought (CoT) prompting with DPO driven by SQL execution results. The model converts natural language questions into SQL and, by using execution results as feedback, achieves state-of-the-art performance on the BIRD-test benchmark.
Arctic Text2SQL: ExCoT Overview
Snowflake's Llama-3.1-Arctic-ExCoT-70B is a 70-billion-parameter model designed for Text-to-SQL. It introduces the ExCoT (Execution-Guided Chain-of-Thought) framework, which uses SQL execution results, rather than human preferences, as the feedback signal for DPO (Direct Preference Optimization). Because the feedback comes from running queries instead of collecting human judgments, this approach enables scalable, high-quality optimization without expensive human annotations.
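To make the idea of execution-based feedback concrete, here is a minimal sketch of how DPO preference pairs could be built by running candidate queries and comparing their results with those of a reference query. It is an illustration under stated assumptions, not the ExCoT training pipeline: the helper names (`run_query`, `build_preference_pairs`), the toy `employees` table, and the rule of pairing every correct candidate with every incorrect one are all hypothetical, and the candidates are bare SQL strings rather than full chain-of-thought responses.

```python
import sqlite3
from itertools import product


def run_query(conn, sql):
    """Return result rows in a canonical order, or None if the query fails."""
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None
    return sorted(rows, key=repr)


def build_preference_pairs(conn, question, gold_sql, candidates):
    """Label candidates by execution result, then pair correct with incorrect ones."""
    gold = run_query(conn, gold_sql)
    if gold is None:
        raise ValueError("reference query failed to execute")
    chosen, rejected = [], []
    for cand in candidates:
        # A candidate counts as correct only if it runs and returns the same rows.
        (chosen if run_query(conn, cand) == gold else rejected).append(cand)
    return [
        {"prompt": question, "chosen": c, "rejected": r}
        for c, r in product(chosen, rejected)
    ]


# Toy database and candidates (hypothetical, for illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employees (name TEXT, salary INTEGER);
    INSERT INTO employees VALUES ('Ann', 120), ('Bob', 90), ('Cy', 150);
    """
)
candidates = [
    "SELECT name FROM employees WHERE salary > 100",  # same rows as reference
    "SELECT name FROM employees WHERE salary >= 90",  # runs, but wrong rows
    "SELECT nme FROM employees",                      # fails to execute
]
pairs = build_preference_pairs(
    conn,
    "Which employees earn more than 100?",
    "SELECT name FROM employees WHERE salary > 100 ORDER BY name",
    candidates,
)
for pair in pairs:
    print(pair["chosen"], "| preferred over |", pair["rejected"])
```

No human labels the pairs in this sketch: the database decides which candidate is preferred, which is what lets this style of feedback scale.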
Key Capabilities & Performance
- State-of-the-Art Text-to-SQL: Achieved best-in-class performance on the BIRD-test benchmark in the single-model, single-inference category.
- Significant Improvement: Improved execution accuracy on the BIRD-dev set from the base Llama 3.1 70B model's 57.37% to 68.51%.
- Outperforms Frontier Models: Scores more than 10 points higher than well-known general-purpose models such as GPT-4o, Claude 3.5 Sonnet, and Mistral-large-2407.
- Data Efficiency: Optimized using only public datasets (BIRD and Spider) and no additional Text2SQL data.
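The execution accuracy figures above measure how often a predicted query returns the same result as the reference query. The following is a simplified sketch of that metric, assuming a SQLite database and set-based comparison of result rows; the official BIRD evaluation script may differ in details, and the `orders` table and example queries are hypothetical.

```python
import sqlite3


def execution_match(conn, predicted_sql, gold_sql):
    """True if the predicted query runs and returns the same set of rows."""
    try:
        predicted = set(conn.execute(predicted_sql).fetchall())
    except sqlite3.Error:
        return False  # a query that fails to execute counts as a miss
    gold = set(conn.execute(gold_sql).fetchall())
    return predicted == gold


def execution_accuracy(conn, examples):
    """Percentage of (predicted, gold) pairs whose results match."""
    hits = sum(execution_match(conn, p, g) for p, g in examples)
    return 100.0 * hits / len(examples)


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (id INTEGER, amount INTEGER);
    INSERT INTO orders VALUES (1, 10), (2, 25), (3, 40);
    """
)
examples = [
    ("SELECT COUNT(*) FROM orders", "SELECT COUNT(id) FROM orders"),
    ("SELECT id FROM orders WHERE amount > 20",
     "SELECT id FROM orders WHERE amount >= 25"),
    ("SELECT MAX(amount) FROM orders",
     "SELECT amount FROM orders ORDER BY amount DESC LIMIT 1"),
    ("SELECT SUM(amount) FROM orders WHERE id = 1",
     "SELECT SUM(amount) FROM orders"),  # different result, counts as a miss
]
accuracy = execution_accuracy(conn, examples)
print(f"execution accuracy: {accuracy:.2f}%")

# Gain reported on BIRD-dev, in percentage points.
gain = round(68.51 - 57.37, 2)
print(f"gain over base model: {gain} points")
```

Note that differently written queries still count as correct when they return the same rows, so the metric rewards results rather than surface form. The reported BIRD-dev gain works out to 11.14 percentage points.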
Good For
- Complex Text-to-SQL Conversion: Ideal for applications requiring accurate translation of natural language into SQL queries.
- Database Interaction: Enhancing user interfaces for database querying without requiring SQL expertise.
- Benchmarking Text-to-SQL Models: Provides a strong baseline and competitive performance for research and development in the Text-to-SQL domain.
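For the database-interaction use case, an application typically shows the model the schema alongside the user's question and then runs the returned SQL. The sketch below illustrates that loop with SQLite. Everything here is an assumption made for illustration: the prompt layout is not the model's documented prompt format, `generate_sql` is a hypothetical callable standing in for whatever inference client is used, and `fake_model` is a stub that returns a fixed query.

```python
import sqlite3


def schema_text(conn):
    """Collect the CREATE TABLE statements SQLite stores for each table."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND sql IS NOT NULL"
    ).fetchall()
    return "\n".join(row[0] for row in rows)


def build_prompt(conn, question):
    """Hypothetical prompt layout: schema first, then the question."""
    return f"Database schema:\n{schema_text(conn)}\n\nQuestion: {question}\nSQL:"


def answer(conn, question, generate_sql):
    """Ask the model for SQL, then run it with writes disabled."""
    sql = generate_sql(build_prompt(conn, question))
    conn.execute("PRAGMA query_only = ON")  # ask SQLite to reject writes
    try:
        return sql, conn.execute(sql).fetchall()
    finally:
        conn.execute("PRAGMA query_only = OFF")


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE books (title TEXT, year INTEGER);
    INSERT INTO books VALUES ('A', 1999), ('B', 2005), ('C', 2010);
    """
)


def fake_model(prompt):
    # Stub standing in for a real call to the model.
    return "SELECT title FROM books WHERE year > 2000 ORDER BY title"


prompt = build_prompt(conn, "Which books were published after 2000?")
sql, rows = answer(conn, "Which books were published after 2000?", fake_model)
print(sql)
print(rows)
```

Running generated SQL with writes disabled is a cautious default for user-facing tools, since the model's output is executed without human review.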