NBAmine/mistral-nemo-text-to-sql

Text Generation | Concurrency Cost: 1 | Model Size: 12B | Quant: FP8 | Ctx Length: 32k | Published: Feb 10, 2026 | Architecture: Transformer | Cold

NBAmine/mistral-nemo-text-to-sql is a 12.2 billion parameter Mistral-Nemo model fine-tuned by NBAmine for Text-to-SQL generation. It converts natural language questions into SQL queries and was trained with a two-phase curriculum learning strategy for syntactic and logical alignment. The model is optimized for generating standalone SQL queries compatible with standard SQL engines and achieves 69.5% execution accuracy on the Spider validation set.


Mistral-Nemo-12B-Text-to-SQL Overview

This model, developed by NBAmine, is a 12.2 billion parameter Mistral-Nemo variant fine-tuned specifically for Text-to-SQL generation. It converts natural language questions into executable SQL queries when the database schema is provided as DDL context. The model is a full-precision (BF16) merged version, representing peak performance before quantization.
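
As a rough illustration of "question plus DDL context in, SQL out", the sketch below builds a prompt and decodes one query with Hugging Face Transformers. The prompt layout ("### Schema" / "### Question" / "### SQL") and the helper names `build_prompt` and `generate_sql` are assumptions made for this example; this page does not document the exact template used during fine-tuning, so the model card's own format should take precedence if it differs. It also assumes `torch`, `transformers`, and `accelerate` are installed and that the repository loads through `AutoModelForCausalLM`.

```python
def build_prompt(ddl: str, question: str) -> str:
    """Combine schema DDL and a question into one prompt.

    The section headers are a hypothetical layout, not the documented
    training template.
    """
    return (
        "### Schema\n"
        f"{ddl.strip()}\n\n"
        "### Question\n"
        f"{question.strip()}\n\n"
        "### SQL\n"
    )


def generate_sql(ddl: str, question: str, max_new_tokens: int = 256) -> str:
    """Greedy-decode one SQL query for the given schema and question."""
    # Imported lazily so build_prompt can be used without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "NBAmine/mistral-nemo-text-to-sql"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # BF16, matching the merged weights described above
        device_map="auto",           # needs the accelerate package
    )
    inputs = tokenizer(build_prompt(ddl, question), return_tensors="pt").to(model.device)
    with torch.no_grad():
        output = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Keep only the newly generated tokens, not the echoed prompt.
    new_tokens = output[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```

Calling `generate_sql("CREATE TABLE singer (id INT, name TEXT);", "How many singers are there?")` would download the 12B checkpoint on first use, so it is best tried on a machine with enough GPU memory for BF16 weights.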

Key Capabilities

  • Natural Language to SQL Generation: Translates complex natural language queries into standard SQL.
  • DDL Context Understanding: Utilizes database schema (DDL) to generate accurate and contextually relevant SQL.
  • Curriculum Learning: Trained using a two-phase strategy, first focusing on SQL syntax and basic schema mapping, then advancing to complex reasoning tasks like multiple JOIN operations and nested subqueries.
  • High Accuracy: Achieves 69.5% Execution Accuracy (EX) on the challenging Spider validation set.
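
Execution Accuracy counts a prediction as correct when it returns the same result as the reference query on the target database, rather than when the SQL text matches. The sketch below shows the idea with SQLite from the Python standard library. It is a simplified illustration and not the evaluation script behind the 69.5% figure; the table, the queries, and the function names are made up for the example.

```python
import sqlite3
from collections import Counter


def run_query(conn, sql):
    """Return the result rows as a multiset, or None if the query fails."""
    try:
        return Counter(conn.execute(sql).fetchall())
    except sqlite3.Error:
        return None


def execution_match(conn, predicted_sql, gold_sql):
    """True when both queries execute and return the same rows (order ignored)."""
    predicted = run_query(conn, predicted_sql)
    gold = run_query(conn, gold_sql)
    return predicted is not None and predicted == gold


def execution_accuracy(conn, pairs):
    """Fraction of (predicted, gold) pairs whose results match."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    return sum(execution_match(conn, p, g) for p, g in pairs) / len(pairs)


# Toy database, purely illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
    INSERT INTO singer VALUES (1, 'Ana', 31), (2, 'Ben', 45), (3, 'Cy', 27);
""")

pairs = [
    # Different SQL text, same result: counts as correct.
    ("SELECT count(*) FROM singer", "SELECT COUNT(id) FROM singer"),
    # Same rows in a different order: counts as correct here.
    ("SELECT name FROM singer ORDER BY age", "SELECT name FROM singer"),
    # Executes, but returns different rows: incorrect.
    ("SELECT name FROM singer WHERE age > 40", "SELECT name FROM singer WHERE age > 30"),
    # Misspelled column, fails to execute: incorrect.
    ("SELECT nme FROM singer", "SELECT name FROM singer"),
]

accuracy = execution_accuracy(conn, pairs)  # 0.5 for the pairs above
```

Comparing rows as a multiset ignores ordering, which is a deliberate simplification: a stricter checker would compare ordered results when the reference query contains `ORDER BY`.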

Training and Architecture

The model was fine-tuned using QLoRA (Rank 16, Alpha 32) with 4-bit NF4 quantization during training. It leverages the standard Mistral-Nemo 12B architecture, featuring 40 layers and Grouped Query Attention (GQA) with 8 KV heads. The maximum context length supported is 2048 tokens.
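
To make these numbers concrete, the sketch below works out two quantities that follow from them: the LoRA scaling factor (alpha divided by rank, as in standard LoRA) and the size of the key/value cache under GQA. The rank, alpha, layer count, and KV-head count come from the paragraph above. The head dimension of 128 is an assumption taken from the public Mistral-Nemo configuration and is not stated on this page, and the function names are illustrative.

```python
RANK, ALPHA = 16, 32               # QLoRA settings from the text
NUM_LAYERS, NUM_KV_HEADS = 40, 8   # architecture figures from the text
HEAD_DIM = 128                     # assumption: public Mistral-Nemo config
BYTES_PER_VALUE = 2                # BF16 stores each value in 2 bytes


def lora_scaling(alpha: int, rank: int) -> float:
    """Factor applied to the low-rank update in standard LoRA (alpha / r)."""
    return alpha / rank


def kv_cache_bytes(
    num_tokens: int,
    num_layers: int = NUM_LAYERS,
    num_kv_heads: int = NUM_KV_HEADS,
    head_dim: int = HEAD_DIM,
    bytes_per_value: int = BYTES_PER_VALUE,
) -> int:
    """Bytes needed to cache keys and values for one sequence."""
    # Factor 2 covers the separate key and value tensors in every layer.
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_value * num_tokens


scaling = lora_scaling(ALPHA, RANK)          # 2.0
per_token = kv_cache_bytes(1)                # 163840 bytes, i.e. 160 KiB
full_context = kv_cache_bytes(2048) / 2**20  # 320.0 MiB at 2048 tokens
```

Because only the 8 KV heads are cached, the cache is smaller than it would be if every query head had its own key and value projections, which is the main memory benefit of GQA at inference time.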

Good For

  • Applications requiring robust and accurate conversion of natural language into SQL queries.
  • Developers looking for a high-performance Text-to-SQL model as a "Source of Truth" for further optimizations.