ShacharNar/qwen2.5_coder_3b_sqlfuse_probgate_only_answerable_delimeters_eos
ShacharNar/qwen2.5_coder_3b_sqlfuse_probgate_only_answerable_delimeters_eos is a 3.1 billion parameter language model fine-tuned from Qwen/Qwen2.5-Coder-3B for text-to-SQL tasks. It generates SQL queries from natural language prompts, which makes it suitable for database interaction and data retrieval use cases. The model has a context length of 32768 tokens.
Overview
This model is a fine-tuned variant of Qwen/Qwen2.5-Coder-3B (3.1 billion parameters). It specializes in text-to-SQL conversion: translating natural language questions into executable SQL statements.
Key Capabilities
- Text-to-SQL Generation: Converts natural language questions into SQL queries, the core step in a natural language database interface.
- Code-Oriented Base: Built upon a coder-specific foundation, enhancing its ability to understand and generate structured code like SQL.
- Fine-tuned Performance: Reached a validation loss of 0.2362 over its 5 epochs of training.
- Large Context Window: Supports a context length of 32768 tokens, allowing for more complex and detailed input prompts for SQL generation.
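The capabilities above can be sketched with the Hugging Face transformers library. This is a minimal sketch, not the author's documented usage: the prompt layout in `build_prompt` is hypothetical (the card does not describe the prompt or delimiter format used in fine-tuning), and `generate_sql` and `max_new_tokens=256` are illustrative choices.

```python
MODEL_ID = "ShacharNar/qwen2.5_coder_3b_sqlfuse_probgate_only_answerable_delimeters_eos"
MAX_CONTEXT_TOKENS = 32768  # context length stated in this card


def build_prompt(schema: str, question: str) -> str:
    """Assemble a plain-text prompt (hypothetical layout, not the training format)."""
    return (
        "-- Database schema\n"
        f"{schema.strip()}\n\n"
        "-- Question\n"
        f"{question.strip()}\n\n"
        "-- SQL\n"
    )


def generate_sql(schema: str, question: str, max_new_tokens: int = 256) -> str:
    """Load the model and generate SQL for one question (downloads the weights)."""
    # Imported lazily so build_prompt can be used without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(build_prompt(schema, question), return_tensors="pt")
    prompt_len = inputs["input_ids"].shape[1]
    if prompt_len + max_new_tokens > MAX_CONTEXT_TOKENS:
        raise ValueError("prompt plus generation budget exceeds the context length")

    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(output_ids[0][prompt_len:], skip_special_tokens=True)
```

Calling `generate_sql("CREATE TABLE users (id INTEGER, name TEXT);", "How many users are there?")` would return the model's raw completion. If the fine-tuning data used specific delimiters, the prompt would need to match them for best results.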
Training Details
The model was trained for 5 epochs with the AdamW optimizer, a learning rate of 5e-05, and a batch size of 1, using Native AMP for mixed precision. Training completed on 2025-11-28 on an NVIDIA A100 80GB PCIe GPU.
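For readers who want to set up a comparable run, the stated hyperparameters map onto `transformers.TrainingArguments` roughly as follows. This is a sketch under assumptions: the card does not say which AdamW implementation or which half-precision type was used, so `optim="adamw_torch"` and `fp16=True` are guesses, and `output_dir` is a hypothetical path.

```python
# Hyperparameters stated in the card, keyed by transformers.TrainingArguments names.
TRAINING_KWARGS = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 1,  # assumption: "batch size of 1" is per device
    "num_train_epochs": 5,
    "optim": "adamw_torch",  # assumption: the card only says "AdamW"
    "fp16": True,            # assumption: Native AMP could also mean bf16
}


def make_training_args(output_dir: str = "sqlfuse-out"):
    """Build TrainingArguments from the values above (output_dir is hypothetical)."""
    # Imported lazily; fp16=True expects a CUDA-capable GPU at construction time.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)
```

Settings the card does not mention (scheduler, warmup, gradient accumulation, seed) are left at the library defaults here, which may differ from the original run.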
Good For
- Database Interaction: Automating the generation of SQL queries from user input.
- Developer Tools: Integrating natural language interfaces for SQL databases.
- Data Analysis: Assisting users in querying data without deep SQL knowledge.
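For the use cases above, generated SQL typically needs a guard before it touches a real database. The following is a minimal sketch using Python's built-in `sqlite3` module. The `run_select` helper, the `users` table, and the `generated_sql` string are all hypothetical stand-ins, and the check is deliberately coarse (it rejects any semicolon inside the statement and any query that does not start with `SELECT`).

```python
import sqlite3


def run_select(conn: sqlite3.Connection, sql: str) -> list[tuple]:
    """Execute one generated statement, refusing anything that is not a SELECT."""
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:
        raise ValueError("expected a single statement")
    if not statement.lower().startswith("select"):
        raise ValueError("only SELECT statements are allowed")
    return conn.execute(statement).fetchall()


# Small in-memory database to try the helper against.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("Ada",), ("Linus",)])

generated_sql = "SELECT COUNT(*) FROM users;"  # stand-in for model output
rows = run_select(conn, generated_sql)
```

In a production setting, a read-only database role or connection is a stronger safeguard than string checks; the helper only illustrates where such a check would sit.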