MohamedQiqa/legal-documents-ocr-parser-1.0

VISIONConcurrency Cost:1Model Size:4.3BQuant:BF16Ctx Length:32kPublished:May 10, 2026Architecture:Transformer Cold

MohamedQiqa/legal-documents-ocr-parser-1.0 is a 4.3 billion parameter multimodal vision-language model fine-tuned from Google's Gemma-3-4B-IT. It specializes in structured metadata extraction from scanned Arabic legal documents, including government regulations and official correspondence. The model outputs comprehensive JSON objects containing classified document metadata, such as document type, issuing authority, and official marks. It is optimized for automating the cataloging and indexing of Arabic legal archives.

Loading preview...

Overview

MohamedQiqa/legal-documents-ocr-parser-1.0 is a specialized multimodal vision-language model, fine-tuned from Google's Gemma-3-4B-IT, designed for structured metadata extraction from scanned Arabic legal documents. This 4.3 billion parameter model processes document images and outputs a comprehensive JSON object containing classified metadata, including document type, issuing authority, physical properties, official seals, signatures, and routing information. The model was fine-tuned using LoRA (Low-Rank Adaptation) with QLoRA for memory efficiency during training, and the adapter was subsequently merged into the base model, providing full-precision weights for direct inference.

Key Capabilities

  • Structured Metadata Extraction: Extracts specific fields like document classification, source, physical properties, official marks, signatures, and routing information.
  • Arabic Legal Document Specialization: Optimized for government regulations, ministerial correspondence, and institutional records in Arabic.
  • Multimodal Processing: Takes image inputs of scanned documents and generates text-based JSON outputs.
  • Full Merged Model: The released model contains full-precision merged weights, eliminating the need for separate adapter loading or quantization during inference.

Good for

  • Document Digitization Pipelines: Automating the cataloging and indexing of scanned Arabic legal archives.
  • Legal Document Triage: Rapidly classifying document types and identifying issuing authorities.
  • Metadata Auto-population: Integrating into Document Management Systems (DMS) for automatic field population.
  • RAG Pipelines: Serving as a structured extraction layer for retrieval-augmented generation systems dealing with Arabic legal texts.