jastorj/snowflake_arctic_text2sql_r1_7b-nl2sqlpp-16bit-v5.7.8_phase_3-cw-29K

Task: Text Generation | Concurrency Cost: 1 | Model Size: 7.6B | Quant: FP8 | Ctx Length: 32k | Published: May 22, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights | Warm

This is a 7.6-billion-parameter model from jastorj, fine-tuned from Snowflake/Arctic-Text2SQL-R1-7B for Text-to-SQL generation. It converts natural language questions into SQL++ queries and was trained on the NL2SQL++ v8 dataset, which uses code-with-thought reasoning. Given a database schema and a natural language question, it generates a matching SQL++ query, which makes it suitable for querying SQL++ databases from natural language.


Model Overview

This model, jastorj/snowflake_arctic_text2sql_r1_7b-nl2sqlpp-16bit-v5.7.8_phase_3-cw-29K, is a 7.6-billion-parameter fine-tune of the Snowflake/Arctic-Text2SQL-R1-7B base model. It targets Text-to-SQL generation for the SQL++ dialect specifically.

Key Capabilities

  • Text-to-SQL Conversion: Translates natural language queries into valid SQL++ statements.
  • Schema-Aware Generation: Utilizes provided database schema to generate contextually accurate queries.
  • Reasoning Integration: Fine-tuned on the NL2SQL++ v8 dataset, which incorporates "code-with-thought" reasoning to improve query generation logic.
  • SQL++ Specifics: Follows SQL++ syntax and conventions, including backtick-quoting of identifiers, SUBSTR usage, type handling (e.g., TO_NUMBER instead of CAST), and unique aliases.
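
The capabilities above imply that a request carries both a schema and a question, along with the SQL++ conventions the model was tuned on. The sketch below shows one way such a prompt could be assembled. The layout, the `build_prompt` helper, the rule wording, and the example schema are all assumptions made for illustration; the exact prompt format used during fine-tuning is not given here.

```python
# Rules paraphrased from the capability list; the wording is a placeholder,
# not the instruction text used in training.
SQLPP_RULES = [
    "Quote identifiers with backticks.",
    "Use SUBSTR for substring extraction.",
    "Use TO_NUMBER instead of CAST for numeric conversion.",
    "Give every alias a unique name.",
]


def build_prompt(schema: str, question: str) -> str:
    """Assemble a schema-aware prompt (hypothetical layout)."""
    rules = "\n".join(f"- {rule}" for rule in SQLPP_RULES)
    return (
        "Database schema:\n"
        f"{schema.strip()}\n\n"
        "SQL++ rules:\n"
        f"{rules}\n\n"
        f"Question: {question.strip()}\n"
        "Write one SQL++ query that answers the question."
    )


if __name__ == "__main__":
    # Hypothetical schema and question, for illustration only.
    schema = "`hotel`(`name` STRING, `city` STRING, `price` STRING)"
    print(build_prompt(schema, "Which hotels in Paris cost under 200?"))
```

Keeping the rules in a list makes it straightforward to adjust them once the model's actual expected instructions are known.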

Training Details

The model was fine-tuned using LoRA (Low-Rank Adaptation) with Unsloth, and its weights are stored at 16-bit precision. The training set comprised 10,055 examples, with a further 1,062 examples held out for validation. Training ran for 3 epochs with a learning rate of 1e-05 and a maximum sequence length of 29,000 tokens.
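
To put the figures above in proportion, the following sketch records the stated hyperparameters and derives a few quantities from them. The `TrainingSummary` class is a hypothetical container, not part of any training script; values the card does not state (LoRA rank, batch size, and so on) are deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingSummary:
    # Only values stated in the model card are recorded here.
    train_examples: int = 10_055
    val_examples: int = 1_062
    epochs: int = 3
    learning_rate: float = 1e-5
    max_seq_length: int = 29_000

    @property
    def total_examples(self) -> int:
        return self.train_examples + self.val_examples

    @property
    def val_fraction(self) -> float:
        return self.val_examples / self.total_examples

    @property
    def examples_seen(self) -> int:
        # Each epoch passes over every training example once.
        return self.train_examples * self.epochs


summary = TrainingSummary()
print(summary.total_examples)          # 11117
print(round(summary.val_fraction, 3))  # 0.096
print(summary.examples_seen)           # 30165
```

So roughly 9.6% of the data was held out for validation, and the model saw about 30,000 example passes in total.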

Good For

  • Applications requiring automated SQL++ query generation from natural language.
  • Developers working with Couchbase or other SQL++ compatible databases who need to streamline query creation.
  • Use cases where precise and schema-compliant SQL++ is critical.
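
Where schema-compliant output matters, it may help to run a lightweight check on each generated query before executing it. The sketch below is a heuristic linter for two of the conventions named in this card (TO_NUMBER instead of CAST, and unique aliases). The `lint_sqlpp` function is hypothetical and regex-based, so it can misfire on string literals or unusual formatting; it does not replace a real SQL++ parser.

```python
import re
from collections import Counter


def lint_sqlpp(query: str) -> list[str]:
    """Return warnings for a generated query (heuristic, not a parser)."""
    warnings = []

    # The card says the model is steered toward TO_NUMBER rather than CAST.
    if re.search(r"\bCAST\s*\(", query, flags=re.IGNORECASE):
        warnings.append("uses CAST; expected TO_NUMBER")

    # Collect names introduced with AS, backtick-quoted or bare.
    aliases = re.findall(
        r"\bAS\s+`?([A-Za-z_][A-Za-z0-9_]*)`?", query, flags=re.IGNORECASE
    )
    counts = Counter(alias.lower() for alias in aliases)
    for alias, n in sorted(counts.items()):
        if n > 1:
            warnings.append(f"alias '{alias}' is defined {n} times")

    return warnings


if __name__ == "__main__":
    # Hypothetical generated query that breaks both conventions.
    bad = (
        "SELECT CAST(a.x AS INT) AS v, b.y AS v "
        "FROM t AS a JOIN t AS b ON a.id = b.id"
    )
    for warning in lint_sqlpp(bad):
        print(warning)
```

An empty list means the query passed these two checks only; it says nothing about whether the query is valid against the actual schema.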