jastorj/snowflake_arctic_text2sql_r1_7b-nl2sqlpp-16bit-v5.7.8_phase_1-cw-5K

Text Generation | Concurrency Cost: 1 | Model Size: 7.6B | Quant: FP8 | Ctx Length: 32k | Published: May 16, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights | Warm

This model is a 7.6 billion parameter fine-tune of Snowflake's Arctic-Text2SQL-R1-7B for Text-to-SQL generation. jastorj fine-tuned it on the NL2SQL++ v8 dataset, which incorporates code-with-thought reasoning, so that it generates SQL++ queries from natural language. It is also trained to analyze SQL++ errors and return corrected queries, which makes it suited to database interaction and query debugging.


Overview

jastorj/snowflake_arctic_text2sql_r1_7b-nl2sqlpp-16bit-v5.7.8_phase_1-cw-5K is a 7.6 billion parameter language model derived from Snowflake's Arctic-Text2SQL-R1-7B. It was fine-tuned with LoRA (Low-Rank Adaptation) using Unsloth and targets Text-to-SQL generation for SQL++.
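For readers unfamiliar with LoRA, the following is a minimal sketch of the standard merge arithmetic, in which a low-rank update is folded into a base weight matrix as W + (alpha / r) * B A. The toy matrices, the `alpha` value, and the helper names are illustrative assumptions; the actual rank, scaling, and target layers used for this model are not stated in the card.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A, where r is the LoRA rank.

    A has shape (r, d_in) and B has shape (d_out, r), so B @ A matches W.
    """
    r = len(a)
    scale = alpha / r
    delta = matmul(b, a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# Toy values only: a 2x2 base weight with a rank-1 adapter.
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 2.0]]    # shape (r=1, d_in=2)
B = [[0.5], [0.0]]  # shape (d_out=2, r=1)

merged = merge_lora(W, A, B, alpha=2.0)
print(merged)
```

After a merge of this kind, the adapter no longer needs to be loaded separately, since the update lives in the weight matrix itself.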

Key Capabilities

  • Text-to-SQL Generation: Translates natural language queries into valid SQL++ queries.
  • Error Analysis and Correction: Analyzes SQL++ query errors and provides corrected versions, a behavior learned from the code-with-thought reasoning examples in the NL2SQL++ v8 dataset.
  • Contextual Understanding: Incorporates detailed instructions and schema information to generate precise and contextually appropriate SQL++ queries.
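The capabilities above suggest a two-step workflow: ask for a query, then feed any execution error back for correction. The sketch below assembles such prompts as plain strings. The prompt layout, the `build_prompt` helper, the example schema, and the error message are all hypothetical; the card does not document the template the model was trained on, so the real format may differ.

```python
def build_prompt(schema, question, failed_query=None, error=None):
    """Assemble a text prompt from schema, question, and optional error feedback.

    The section labels here are an assumed layout, not the model's
    documented training template.
    """
    parts = [
        "Translate the question into a SQL++ query.",
        "Schema:\n" + schema.strip(),
        "Question: " + question.strip(),
    ]
    if failed_query and error:
        # Second pass: include the failed attempt and the error it produced.
        parts.append("Previous attempt:\n" + failed_query.strip())
        parts.append("Error: " + error.strip())
        parts.append("Explain the error, then give a corrected query.")
    return "\n\n".join(parts)


# Made-up schema and question for illustration.
schema = "orders(id, customer_id, total, status)"
question = "What is the total of every open order?"

first_prompt = build_prompt(schema, question)

# Made-up failed attempt and error text, standing in for real database output.
retry_prompt = build_prompt(
    schema,
    question,
    failed_query='SELECT VALUE o.totl FROM orders o WHERE o.status = "open";',
    error="field 'totl' not found",
)
print(retry_prompt)
```

Either string would then be passed to whatever inference client serves the model; that call is left out here because it depends on the deployment.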

Training Details

The model was fine-tuned on the NL2SQL++ v8 dataset, which includes 39,243 training examples and 1,062 validation examples. The fine-tuned weights are merged and stored in 16-bit precision, and training used a maximum sequence length of 5,500 tokens. The training configuration emphasizes learning from error feedback to improve query accuracy and adherence to SQL++ syntax and best practices.
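As a rough guide to what the precision figures mean in practice, the sketch below estimates the storage needed for the weights alone at 16-bit and at FP8, using the 7.6B parameter count from the card. It is back-of-the-envelope arithmetic under the assumption that every parameter is stored at the stated width; it ignores activations, the KV cache, and runtime overhead.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


NUM_PARAMS = 7.6e9  # parameter count stated in the card

fp16_gb = weight_memory_gb(NUM_PARAMS, 16)  # 16-bit merged weights
fp8_gb = weight_memory_gb(NUM_PARAMS, 8)    # FP8 serving quantization

print(f"16-bit: {fp16_gb:.1f} GB, FP8: {fp8_gb:.1f} GB")
```

Under these assumptions the weights occupy roughly 15.2 GB at 16-bit and 7.6 GB at FP8.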