jastorj/snowflake_arctic_text2sql_r1_7b-nl2sqlpp-16bit-v5.7.8_phase_3-cw-29K
This is a 7.6 billion parameter model from jastorj, fine-tuned from Snowflake/Arctic-Text2SQL-R1-7B for Text-to-SQL generation. It specializes in converting natural language queries into SQL++ code, leveraging the NL2SQL++ v8 dataset with code-with-thought reasoning. The model is optimized for generating accurate SQL++ queries based on provided schema and natural language input, making it suitable for database interaction tasks.
Loading preview...
Model Overview
This model, jastorj/snowflake_arctic_text2sql_r1_7b-nl2sqlpp-16bit-v5.7.8_phase_3-cw-29K, is a 7.6 billion parameter variant of the Snowflake/Arctic-Text2SQL-R1-7B base model. It has been specifically fine-tuned for Text-to-SQL generation, focusing on the SQL++ dialect.
Key Capabilities
- Text-to-SQL Conversion: Translates natural language queries into valid SQL++ statements.
- Schema-Aware Generation: Utilizes provided database schema to generate contextually accurate queries.
- Reasoning Integration: Fine-tuned on the NL2SQL++ v8 dataset, which incorporates "code-with-thought" reasoning to improve query generation logic.
- SQL++ Specifics: Adheres to SQL++ syntax and conventions, including specific instructions for backtick-quoting,
SUBSTRusage, type handling (e.g.,TO_NUMBERinstead ofCAST), and alias uniqueness.
Training Details
The model was fine-tuned using LoRA (Low-Rank Adaptation) with Unsloth, and its weights are quantized to 16-bit. The training dataset comprised 10,055 examples, with an additional 1,062 examples for validation. It was trained for 3 epochs with a learning rate of 1e-05 and a maximum sequence length of 29000 tokens.
Good For
- Applications requiring automated SQL++ query generation from natural language.
- Developers working with Couchbase or other SQL++ compatible databases who need to streamline query creation.
- Use cases where precise and schema-compliant SQL++ is critical.