Name: AyaEhab258/NASAQ4.1 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: AyaEhab258

Overview

AyaEhab258/NASAQ4.1 is a 7.9 billion parameter vision-language model, fine-tuned from Gemma 4 E4B, specifically designed for Optical Character Recognition (OCR) of historical Arabic calligraphy. It focuses on transcribing text from various calligraphic styles such as Naskh, Thuluth, Diwani, Kufic, and Muhaqqaq, utilizing the HICMA dataset alongside custom collected samples.

Training Approach

The model underwent LoRA fine-tuning with an OCR-only objective, meaning it does not perform joint style classification. The training involved a two-phase process: an initial base fine-tune followed by a refinement phase to optimize performance.

Performance Metrics

On a held-out test set of 602 images, the model achieved a Character Error Rate (CER) of 20.65%, a Word Error Rate (WER) of 48.17%, and a Levenshtein Ratio of 86.22%. Performance varies by style, with Naskh and Muhaqqaq showing the lowest CERs (12.9% and 14.1% respectively), while Kufic and Diwani have higher error rates (47.4% and 51.6%) primarily due to limited training data for these specific styles.

Key Capabilities

Specialized OCR: Highly effective at transcribing historical Arabic calligraphy.
Multi-style Support: Handles Naskh, Thuluth, Diwani, Kufic, and Muhaqqaq scripts.
Image-to-Text: Processes image inputs to generate transcribed Arabic text.

When to Use This Model

This model is ideal for researchers, historians, and developers working with historical Arabic documents, manuscripts, or any visual content containing complex Arabic calligraphy that requires accurate text extraction. It is particularly strong for Naskh and Muhaqqaq styles.

Overview

Overview

Training Approach

Performance Metrics

Key Capabilities

When to Use This Model

Full Model Card (README)