pawlaszc/DigitalForensicsText2SQLite

Text Generation | Concurrency Cost: 1 | Model Size: 3.2B | Quant: BF16 | Ctx Length: 32k | Published: Jan 29, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights | Warm

pawlaszc/DigitalForensicsText2SQLite is a fine-tuned Llama 3.2-3B model developed by pawlaszc that generates SQLite queries from natural language requests against mobile forensic databases. The 3.2-billion-parameter model converts investigative questions into executable SQL for forensic artifacts such as WhatsApp, Signal, and iMessage. It achieves 93.0% execution accuracy on a held-out test set, comparable to GPT-4o, while running fully locally.


ForensicSQL-Llama-3.2-3B: Specialized Text-to-SQL for Digital Forensics

ForSQLiteLM is a Llama 3.2-3B model, fine-tuned by pawlaszc, specifically designed to translate natural language requests into SQLite queries for mobile forensic databases. This model is integrated into the open-source forensic analysis tool FQLite.

Key Capabilities & Performance

  • High Accuracy: Achieves 93.0% execution accuracy on a 100-example held-out test set, within 2 percentage points of GPT-4o's 95.0% under identical conditions and 56 percentage points above the base Llama model.
  • Local Operation: Runs entirely locally without internet connectivity, crucial for sensitive forensic investigations.
  • Broad Forensic Coverage: Generates queries for 191 forensic artifact categories, including WhatsApp, Signal, iMessage, Android SMS, iOS Health, WeChat, Instagram, and blockchain wallets.
  • Targeted Fine-tuning: Full fine-tune on the SQLiteDS dataset (800 training examples) using Hugging Face Transformers, resulting in a model size of approximately 6 GB (bf16).
  • Performance Breakdown: Matches GPT-4o on 'Easy' (95.1%) and 'Medium' (87.5%) difficulty queries; the main gap is on 'Hard' queries (88.9%), which involve complex CTEs and window functions.
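
The card does not spell out how execution accuracy is computed. A minimal sketch of one common reading, assuming a prediction counts as correct when it returns the same rows as the reference query on the same database, regardless of row order, could look like this. The function names, the toy `messages` table, and the order-insensitive comparison are illustrative assumptions, not the author's evaluation harness.

```python
import sqlite3

def run_query(conn, sql):
    """Return the result rows in a canonical order, or None if the query fails."""
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None
    # Sorting by repr gives an order-insensitive comparison that also
    # tolerates NULLs and mixed column types.
    return sorted(rows, key=repr)

def execution_accuracy(conn, pairs):
    """Fraction of (predicted, gold) SQL pairs whose result sets match."""
    hits = 0
    for predicted, gold in pairs:
        expected = run_query(conn, gold)
        actual = run_query(conn, predicted)
        if expected is not None and actual == expected:
            hits += 1
    return hits / len(pairs)

# Toy database standing in for a forensic artifact (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE messages (id INTEGER PRIMARY KEY, sender TEXT, body TEXT);
INSERT INTO messages (sender, body)
VALUES ('alice', 'hi'), ('bob', 'hello'), ('alice', 'bye');
""")

pairs = [
    # Same result: counted as correct.
    ("SELECT COUNT(*) FROM messages WHERE sender = 'alice'",
     "SELECT COUNT(*) FROM messages WHERE sender='alice'"),
    # Same rows in a different order: counted as correct under this definition.
    ("SELECT body FROM messages ORDER BY id DESC",
     "SELECT body FROM messages"),
    # Unknown table: the prediction fails to execute, counted as wrong.
    ("SELECT body FROM msgs",
     "SELECT body FROM messages"),
    # Executes but returns different rows: counted as wrong.
    ("SELECT sender FROM messages",
     "SELECT DISTINCT sender FROM messages"),
]

score = execution_accuracy(conn, pairs)
print(score)  # 0.5
```

A stricter harness might also compare row order for queries with an ORDER BY clause; that choice changes the score and is worth stating alongside any reported number.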

Intended Use Cases

  • Mobile Forensics: Automating SQL query drafting for seized device databases.
  • Tool Integration: Designed for integration into forensic tools such as FQLite, Autopsy, and ALEAPP/iLEAPP.
  • Research & Education: Valuable for domain-specific Text-to-SQL research and learning forensic database analysis.
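
For tool integration, a Text-to-SQL model generally needs the database schema next to the investigator's request. The sketch below shows one way a host tool might gather that schema from a SQLite file and assemble a prompt. The prompt template, the helper names `read_schema` and `build_prompt`, and the `sms` table are assumptions for illustration; the prompt format this model was actually trained on is not given here and should be taken from the model's own documentation.

```python
import sqlite3

def read_schema(conn):
    """Collect the CREATE TABLE statements stored in sqlite_master."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master "
        "WHERE type = 'table' AND sql IS NOT NULL ORDER BY name"
    ).fetchall()
    return "\n".join(row[0] + ";" for row in rows)

def build_prompt(schema, question):
    """Assemble a prompt. Hypothetical template, not the model's documented format."""
    return (
        "Database schema:\n" + schema + "\n\n"
        "Request: " + question + "\n\n"
        "SQLite query:"
    )

# Stand-in for a seized device database (hypothetical table layout).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sms (_id INTEGER PRIMARY KEY, address TEXT, date INTEGER, body TEXT)"
)

prompt = build_prompt(read_schema(conn), "List all messages sent after 1 January 2024")
print(prompt)
```

The resulting string would then be passed to whichever local inference runtime hosts the model; that step is omitted because it depends on the runtime.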

Important Considerations

  • Drafting Assistant: ForSQLiteLM is a drafting assistant, not a replacement for human SQL expertise. Approximately 1 in 14 queries may contain errors, requiring expert review and validation for critical work.
  • Scope: Specialized for SQLite databases within the forensic domain; not intended for general-purpose SQL generation or other database types.
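
Because drafted queries still need review, a host tool can cheaply reject obviously unusable drafts before an examiner reads them. The following is a minimal sketch under stated assumptions: `check_draft` is a hypothetical helper, the `sms` table is illustrative, and the checks (single statement, SELECT or WITH prefix, compiles under EXPLAIN) are a coarse filter rather than proof that a query is correct.

```python
import sqlite3

def check_draft(conn, sql):
    """Return (ok, reason) for a drafted query without running it."""
    statement = sql.strip().rstrip(";").strip()
    # Conservative: also rejects a lone statement with ';' inside a string literal.
    if ";" in statement:
        return False, "multiple statements"
    # Coarse prefix filter only; a WITH clause can precede a write statement,
    # so the connection itself should still be read-only.
    if not statement.upper().startswith(("SELECT", "WITH")):
        return False, "not a read-only SELECT"
    try:
        # EXPLAIN compiles the statement against the schema and returns its
        # bytecode listing instead of executing the query.
        conn.execute("EXPLAIN " + statement)
    except sqlite3.Error as exc:
        return False, "does not compile: " + str(exc)
    return True, "compiles against this schema"

# Stand-in evidence database (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sms (_id INTEGER PRIMARY KEY, address TEXT, body TEXT)")
# Backstop: refuse writes on this connection even if a draft slips through.
conn.execute("PRAGMA query_only = ON")

results = [
    check_draft(conn, "SELECT address, body FROM sms;"),  # valid draft
    check_draft(conn, "SELECT adress FROM sms"),          # misspelled column
    check_draft(conn, "DELETE FROM sms"),                 # not a SELECT
]
for ok, reason in results:
    print(ok, reason)
```

For an on-disk database, opening it with `sqlite3.connect("file:evidence.db?mode=ro", uri=True)` (the path is a placeholder) adds a read-only guarantee at the file level. A draft that passes these checks can still answer the wrong question, so expert review remains necessary.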