griffith-bigdata/Qwen3-4B-SQL-Writer
The griffith-bigdata/Qwen3-4B-SQL-Writer is a 4 billion parameter Qwen3-based instruction-tuned language model, fine-tuned specifically for SQL generation tasks. This model excels at translating natural language queries into SQL code, leveraging its 32768 token context length for complex database interactions. It is optimized for developers requiring accurate and efficient SQL writing capabilities from natural language prompts.
Loading preview...
Qwen3-4B-SQL-Writer: SQL Generation from Natural Language
This model, developed by griffith-bigdata, is a specialized fine-tuned version of the Qwen3-4B-Instruct base model. It has been specifically optimized for text-to-SQL generation, making it highly effective at converting natural language instructions into executable SQL queries.
Key Capabilities
- SQL Code Generation: Translates natural language prompts into accurate SQL statements.
- Qwen3 Architecture: Built upon the robust Qwen3-4B-Instruct foundation, providing strong language understanding.
- Optimized for Database Interaction: Fine-tuned on a dedicated
sft_text2sql_v2dataset to enhance SQL writing proficiency. - Large Context Window: Features a 32768 token context length, allowing for more complex and detailed SQL generation tasks.
Training Details
The model was trained with a learning rate of 2e-05 over 2 epochs, utilizing a batch size of 2 and accumulating gradients over 64 steps, resulting in an effective total batch size of 128. The training process used the AdamW optimizer with a cosine learning rate scheduler.
Good For
- Automating SQL Query Writing: Ideal for applications that need to dynamically generate SQL based on user input.
- Database Interaction Tools: Can be integrated into tools that simplify database querying for non-technical users.
- Developer Productivity: Assists developers in quickly drafting complex SQL queries.