Ary-007/Text-to-sql-llama-3.2: Text-to-SQL Llama-3.2 Fine-tune
This model is a fine-tuned version of the Llama-3.2-3B-Instruct base model, developed by Ary-007. It is designed for Text-to-SQL tasks: given a database schema and a natural language question, it produces a SQL query and a brief explanation of the query's logic. With 3.2 billion parameters and a 32768-token context length, it is lightweight enough to run on consumer-grade GPUs with 4-bit quantization.
Key Capabilities
- Natural Language to SQL Conversion: Generates SQL queries from natural language questions, given a database schema.
- SQL Explanation: Provides a brief explanation of the generated SQL query's logic.
- Lightweight Deployment: Small enough to run locally on consumer-grade GPUs when loaded with 4-bit quantization.
- Alpaca Prompt Format: Trained on inputs structured in the Alpaca prompt format, so prompts should follow the same structure at inference time.
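The Alpaca prompt format mentioned above can be sketched as follows. The template text is the standard Alpaca instruction/input/response layout. How this model's training mapped the question and the schema onto those slots is not stated here, so the mapping in `build_prompt` (a hypothetical helper) is an assumption.

```python
# Standard Alpaca template with instruction, input, and response slots.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n{response}"
)


def build_prompt(question: str, schema: str) -> str:
    """Hypothetical helper: fill the template for inference.

    ASSUMPTION: the question goes in the Instruction slot and the schema
    in the Input slot; the model card does not confirm this mapping.
    """
    # The response slot is left empty so the model generates it.
    return ALPACA_TEMPLATE.format(instruction=question, input=schema, response="")


prompt = build_prompt(
    "How many employees work in each department?",
    "CREATE TABLE employees (id INT, name TEXT, department TEXT);",
)
print(prompt)
```

The resulting string would be tokenized and passed to the model's generate call. If outputs look off, swapping the two slots is the first thing to try, since the mapping here is a guess.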
Training Details
The model was fine-tuned with Unsloth using QLoRA on the gretelai/synthetic_text_to_sql dataset. Training drew on four fields from this dataset: sql_context (database schema), sql_prompt (natural language question), sql (target SQL query), and sql_explanation (explanation of the query). The base model was loaded with 4-bit NormalFloat (NF4) quantization, and training ran for 60 steps as a proof of concept.
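As an illustration of how the four dataset fields could be combined into one training string, here is a minimal sketch. The slot mapping, the `SQL:` / `Explanation:` response layout, the `format_record` helper, and the sample record are all assumptions made for this example, not details taken from the actual training script.

```python
# Standard Alpaca template with instruction, input, and response slots.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n{response}"
)


def format_record(record: dict, eos_token: str) -> str:
    """Hypothetical helper: turn one dataset row into a training string.

    ASSUMPTIONS: sql_prompt fills the Instruction slot, sql_context fills
    the Input slot, and the response holds the SQL followed by its
    explanation. The real training layout may differ.
    """
    response = f"SQL: {record['sql']}\nExplanation: {record['sql_explanation']}"
    text = ALPACA_TEMPLATE.format(
        instruction=record["sql_prompt"],
        input=record["sql_context"],
        response=response,
    )
    # An end-of-sequence token teaches the model where to stop generating.
    return text + eos_token


# Made-up record using the dataset's field names (not an actual dataset row).
example = {
    "sql_context": "CREATE TABLE employees (id INT, name TEXT, department TEXT);",
    "sql_prompt": "How many employees work in each department?",
    "sql": "SELECT department, COUNT(*) FROM employees GROUP BY department;",
    "sql_explanation": "Groups employees by department and counts the rows in each group.",
}

# "<EOS>" is a placeholder; in practice the token comes from the tokenizer.
text = format_record(example, eos_token="<EOS>")
print(text)
```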
Limitations
- Generalization: Because training ran for only 60 steps, the model may generalize poorly to highly complex queries or to database schemas unlike those in its training data.
- Hallucination Risk: Like other LLMs, it can generate SQL that is syntactically correct but logically flawed. Validate generated queries before production use.
- Scope: Optimized for standard SQL dialects similar to SQLite/PostgreSQL, as represented in its training data.
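One lightweight way to validate generated queries, sketched here with Python's built-in sqlite3 module, is to load the schema into an in-memory database and ask SQLite to plan the query without running it. The `check_sql` helper is hypothetical. It only catches syntax errors and references to missing tables or columns; it cannot tell whether a query answers the question correctly, and schemas written with PostgreSQL-specific syntax may fail to load in SQLite.

```python
import sqlite3


def check_sql(schema: str, query: str) -> tuple[bool, str]:
    """Hypothetical helper: check that a query compiles against a schema.

    Returns (True, "") if SQLite can plan the query, otherwise
    (False, error message). The query itself is never executed.
    """
    conn = sqlite3.connect(":memory:")
    try:
        # Create the tables described by the schema.
        conn.executescript(schema)
        # EXPLAIN QUERY PLAN compiles the statement without executing it.
        conn.execute("EXPLAIN QUERY PLAN " + query)
        return True, ""
    except (sqlite3.Error, sqlite3.Warning) as exc:
        return False, str(exc)
    finally:
        conn.close()


schema = "CREATE TABLE employees (id INT, name TEXT, department TEXT);"

ok_good, msg_good = check_sql(
    schema, "SELECT department, COUNT(*) FROM employees GROUP BY department;"
)
# This query references a column that does not exist in the schema.
ok_bad, msg_bad = check_sql(schema, "SELECT salary FROM employees;")

print(ok_good, msg_good)
print(ok_bad, msg_bad)
```

A check like this is a first filter only. Queries that pass should still be reviewed or tested against known answers before they touch production data.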