jastorj/snowflake_arctic_text2sql_r1_7b-nl2sqlpp-4bit-v8-cw-32K

Text Generation | Concurrency Cost: 1 | Model Size: 7.6B | Quant: FP8 | Ctx Length: 32k | License: apache-2.0 | Architecture: Transformer | Open Weights | Cold

The jastorj/snowflake_arctic_text2sql_r1_7b-nl2sqlpp-4bit-v8-cw-32K model is a 7.6-billion-parameter fine-tune of Snowflake's Arctic-Text2SQL-R1-7B, optimized for Text-to-SQL generation. It was fine-tuned with LoRA on the NL2SQL++ v8 dataset, incorporating code-with-thought reasoning. The model converts natural language queries into SQL and supports a maximum sequence length of 32768 tokens.


Model Overview

This model, jastorj/snowflake_arctic_text2sql_r1_7b-nl2sqlpp-4bit-v8-cw-32K, is a specialized 7.6-billion-parameter language model derived from Snowflake/Arctic-Text2SQL-R1-7B. Its primary function is Text-to-SQL generation: converting natural language questions into executable SQL queries.

Key Capabilities & Features

  • Text-to-SQL Generation: Directly translates natural language into SQL.
  • Fine-tuned for NL2SQL++ v8: Trained on the NL2SQL++ v8 dataset, which covers complex SQL generation tasks and includes code-with-thought reasoning.
  • LoRA Fine-tuning: Utilizes Low-Rank Adaptation (LoRA) with Unsloth for efficient fine-tuning.
  • Quantization: Provides 16-bit merged weights; the listing reports FP8 quantization, which reduces the memory footprint relative to 16-bit.
  • Extended Context Window: Supports a maximum sequence length of 32768 tokens, allowing for more complex schema and query contexts.
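To illustrate how a schema and a question might be packed into the 32768-token window, here is a minimal sketch. The prompt layout, the helper names `build_prompt` and `fits_context`, the reserved output budget, and the rough four-characters-per-token estimate are all assumptions made for illustration; they are not the model's documented prompt template, and an accurate check would count tokens with the model's own tokenizer.

```python
MAX_SEQ_LEN = 32768  # maximum sequence length stated in the model card


def build_prompt(schema_ddl: str, question: str) -> str:
    """Assemble a plain-text prompt (hypothetical layout, not an official template)."""
    return (
        "Database schema:\n"
        f"{schema_ddl.strip()}\n\n"
        "Question:\n"
        f"{question.strip()}\n\n"
        "SQL:"
    )


def fits_context(
    prompt: str,
    reserved_for_output: int = 1024,   # assumed budget for the generated SQL
    chars_per_token: float = 4.0,      # crude heuristic, not a real tokenizer
) -> bool:
    """Estimate whether prompt plus reserved output stays within MAX_SEQ_LEN."""
    estimated_tokens = int(len(prompt) / chars_per_token) + 1
    return estimated_tokens + reserved_for_output <= MAX_SEQ_LEN


schema = "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL);"
prompt = build_prompt(schema, "What is the total revenue per customer?")
print(prompt)
print("fits in context:", fits_context(prompt))
```

A larger window mainly helps when the schema is large: more table definitions can be included verbatim before the prompt has to be trimmed.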

Training Details

The model was trained for 2 epochs with a learning rate of 0.0002, using an effective batch size of 128. LoRA parameters included a rank of 64 and an alpha of 128, targeting key attention and feed-forward modules. The training dataset comprised 46344 examples, with an additional 1986 examples for validation.
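The figures above imply a few derived quantities, sketched here as simple arithmetic. The `TrainingSummary` class is a hypothetical container, not part of any training library. The step counts assume one optimizer step per effective batch with the final partial batch kept, and the scaling factor assumes the standard LoRA convention of alpha divided by rank; the actual training run may have differed on either point.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingSummary:
    """Hyperparameters as reported in the model card (container name is hypothetical)."""
    epochs: int = 2
    learning_rate: float = 2e-4
    effective_batch_size: int = 128
    lora_rank: int = 64
    lora_alpha: int = 128
    train_examples: int = 46344
    val_examples: int = 1986

    @property
    def lora_scale(self) -> float:
        # Standard LoRA scales the low-rank update by alpha / rank (assumed here).
        return self.lora_alpha / self.lora_rank

    @property
    def steps_per_epoch(self) -> int:
        # Assumes the last, partially filled batch is not dropped.
        return math.ceil(self.train_examples / self.effective_batch_size)

    @property
    def total_steps(self) -> int:
        return self.steps_per_epoch * self.epochs


summary = TrainingSummary()
print(summary.lora_scale)       # 2.0
print(summary.steps_per_epoch)  # 363
print(summary.total_steps)      # 726
```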

Ideal Use Cases

This model suits applications that convert user-provided natural language questions into SQL queries, especially those involving complex database schemas or requiring reasoning to construct the SQL.
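In an application, generated SQL is usually checked before it is run. The sketch below shows one way to do that with Python's built-in `sqlite3` module: it creates the schema in an in-memory database and asks SQLite to compile the candidate query with `EXPLAIN`, which prepares the statement without executing it. The helper name `is_valid_sql` and the choice of SQLite are assumptions for illustration; the model card does not state which SQL dialect the model targets, and a query valid in another dialect may be rejected here.

```python
import sqlite3


def is_valid_sql(schema_ddl: str, candidate_sql: str) -> bool:
    """Return True if candidate_sql compiles against schema_ddl in SQLite.

    Hypothetical helper: checks syntax and table/column names only,
    not whether the query answers the user's question.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema_ddl)            # create empty tables
        conn.execute("EXPLAIN " + candidate_sql)  # compile without running
        return True
    except (sqlite3.Error, sqlite3.Warning):
        return False
    finally:
        conn.close()


schema = "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL);"
good = "SELECT customer, SUM(total) FROM orders GROUP BY customer"
bad = "SELECT missing_col FROM orders"
print(is_valid_sql(schema, good))
print(is_valid_sql(schema, bad))
```

A failed check can be fed back to the model as an error message for a retry, or surfaced to the user instead of running the query.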