Overview

Mnemosyne-3B is a 3 billion parameter QLoRA fine-tune of the Qwen2.5-Coder-3B-Instruct model, developed by Zain Asad. Its primary function is to translate natural language questions into SQL queries, with a strong specialization in schemas related to laboratory, scientific, food safety, water quality, and environmental microbiology databases. The model is designed for efficient, low-latency local or server-side inference and is available in bf16 precision and GGUF formats (Q4_K_M, Q8_0) for use with tools like llama.cpp, Ollama, and LM Studio.

Key Capabilities

Domain-Specific Text-to-SQL: Excels at generating SQL for complex laboratory and scientific database schemas, showing a +48% execution accuracy (EX) improvement on a custom lab domain benchmark compared to its base model.
General-Purpose SQL Generation: Can handle general text-to-SQL tasks, though with a modest regression on cross-domain benchmarks like Spider (-7.8% EX).
Low-Latency Inference: Optimized for local deployment and quick response times.
Schema-Aware: Requires a database schema (DDL) to be provided at inference time for accurate query generation.

Intended Use Cases

Laboratory Information Management Systems (LIMS): Ideal for generating queries within LIMS, food and water testing, and scientific data management applications.
Developer Tooling & Data Analyst Assistants: Useful for creating tools that assist developers and data analysts in writing SQL queries.
Schema-Aware Chatbots: Can power chatbots that interact with databases by converting natural language into SQL.

Limitations

Performance on general SQL tasks is slightly lower than the base model.
Requires an explicit schema in the prompt; has no inherent knowledge of databases.
Limited by a 2048-token context length, which may truncate very long schemas.
Not suited for safety-critical automated execution without human review due to potential for semantically incorrect queries.

Overview

Overview

Key Capabilities

Intended Use Cases

Limitations

Full Model Card (README)