MistralSQL-7B Overview
MistralSQL-7B is a 7-billion-parameter instruction-tuned model developed by bugdaryan and fine-tuned from mistralai/Mistral-7B-Instruct-v0.1. It translates natural language questions into SQLite queries, using a database schema supplied in the prompt as context.
Key Capabilities
- Text-to-SQL Generation: Converts natural language questions into executable SQLite queries.
- Schema-Aware Querying: Uses CREATE TABLE statements as context to generate precise SQL queries.
- Instruction Following: Designed to respond to prompts formatted with [INST] and [/INST] tokens, ensuring clear instruction adherence.
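The schema-plus-question prompt described above can be sketched as follows. The instruction wording and the build_prompt helper are assumptions made for illustration. Only the [INST] and [/INST] delimiters and the use of CREATE TABLE statements as context come from the description above, so the exact template should be checked against the model card.

```python
def build_prompt(schema: str, question: str) -> str:
    """Wrap a schema and a question in [INST] ... [/INST] delimiters.

    The instruction text is a hypothetical template, not the model's
    documented one.
    """
    instruction = (
        "Write a SQLite query to answer the question, given the schema.\n"
        f"Schema: {schema.strip()}\n"
        f"Question: {question.strip()}"
    )
    return f"[INST] {instruction} [/INST]"


# Example schema and question (illustrative data only).
schema = "CREATE TABLE employees (id INTEGER, name TEXT, salary REAL)"
prompt = build_prompt(schema, "Which employees earn more than 50000?")
print(prompt)
```

The resulting string would be passed to whatever text-generation interface hosts the model. That step is left out here because it depends on the deployment.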
Training and Dataset
The model was fine-tuned on the bugdaryan/sql-create-context-instruction dataset, which comprises 78,577 examples derived from WikiSQL and Spider. Each example pairs a natural language question and the relevant SQL CREATE TABLE statements with the corresponding SQL query, so the model learns to ground generated queries in a schema. Training ran on two RTX A6000 GPUs (48 GB each) using LoRA with an attention dimension (rank) of 64 and a dropout probability of 0.1, with the base model loaded in 4-bit precision.
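The stated hyperparameters map onto the Hugging Face peft and transformers configuration objects roughly as shown below. This is a sketch, not the author's training script: only the rank of 64, the dropout of 0.1, and 4-bit loading come from the text, while the task type and the choice of these two libraries are assumptions.

```python
# Hyperparameters from the text, keyed by the argument names accepted by
# peft.LoraConfig and transformers.BitsAndBytesConfig.
lora_kwargs = {
    "r": 64,                   # LoRA attention dimension (rank)
    "lora_dropout": 0.1,       # dropout probability on the LoRA layers
    "task_type": "CAUSAL_LM",  # assumption: causal language modelling
}
quant_kwargs = {
    "load_in_4bit": True,      # load the base model in 4-bit precision
}


def build_configs():
    """Build the config objects; requires peft, transformers, bitsandbytes."""
    from peft import LoraConfig
    from transformers import BitsAndBytesConfig

    return LoraConfig(**lora_kwargs), BitsAndBytesConfig(**quant_kwargs)
```

The imports sit inside build_configs so the dictionaries can be inspected without the training dependencies installed. Settings the text does not mention, such as the LoRA scaling factor or target modules, are left at library defaults here and may differ from the values actually used.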
Ideal Use Cases
- Database Interaction: Automating SQL query generation for SQLite databases based on user questions.
- Data Analysis: Assisting users in querying data without requiring deep SQL knowledge.
- Application Development: Integrating natural language interfaces for database operations within applications.
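For the use cases above, an application would typically run the generated query against a SQLite database. A minimal sketch with Python's built-in sqlite3 module follows. The run_select helper, the sample table, and the hard-coded generated_sql string (standing in for model output) are all hypothetical, and the SELECT-only check is a deliberately crude guard rather than a complete safety mechanism.

```python
import sqlite3


def run_select(conn: sqlite3.Connection, sql: str) -> list:
    """Execute a single SELECT statement and return its rows."""
    statement = sql.strip().rstrip(";")
    # Crude guard: reject anything that is not one SELECT statement.
    # It also rejects queries containing ';' inside string literals.
    if not statement.lower().startswith("select") or ";" in statement:
        raise ValueError("only a single SELECT statement is allowed")
    return conn.execute(statement).fetchall()


# In-memory database with illustrative data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER, name TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "Ada", 72000.0), (2, "Bob", 41000.0)],
)

# Stand-in for text returned by the model; hard-coded for this sketch.
generated_sql = "SELECT name FROM employees WHERE salary > 50000;"
rows = run_select(conn, generated_sql)
print(rows)  # [('Ada',)]
```

Because generated SQL is untrusted input, a production setup would likely also open the database read-only and apply limits on execution time and result size.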