Overview

index-card-extractor-4b-v0.1 is a 4.5-billion-parameter open vision-language model fine-tuned from NuExtract-3 (Qwen3.5-4B base). It specializes in extracting structured JSON from images of historical index cards, such as catalogue, vital-record, and manuscript cards. It follows user-defined JSON schemas supplied at inference time, including schemas it did not encounter during training.

Key Capabilities

Schema-driven Extraction: Converts card images into structured JSON according to a provided JSON template or Pydantic schema.
Domain Adaptation: Strong domain knowledge for handwritten and typed archival cards, including French and English death records and English manuscript-catalogue cards.
Schema Generalization: In evaluation, 100% of outputs on unseen schemas and collections were valid, schema-conforming JSON.
Performance: Achieves an exact field-F1 of 0.887 on Teklia (FR handwritten deaths) and a manuscript-number F1 of 0.952 on NLS Advocates, outperforming NuExtract-3 zero-shot and, in some cases, Qwen3-VL-8B.
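
To make the field-level metric above concrete, here is a minimal sketch of one way an exact-match field F1 could be computed. The definition used here (micro-averaged over non-empty fields, exact string match after trimming whitespace) is an assumption, not the evaluation script behind the reported numbers, and the field names and values are hypothetical.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]


def field_counts(pred: Record, gold: Record) -> Tuple[int, int, int]:
    """Return (exact matches, non-empty predicted fields, non-empty gold fields)."""
    pred_filled = {k: v.strip() for k, v in pred.items() if v and v.strip()}
    gold_filled = {k: v.strip() for k, v in gold.items() if v and v.strip()}
    # A predicted field counts as correct only if the gold value is identical.
    tp = sum(1 for k, v in pred_filled.items() if gold_filled.get(k) == v)
    return tp, len(pred_filled), len(gold_filled)


def exact_field_f1(preds: List[Record], golds: List[Record]) -> float:
    """Micro-averaged exact-match F1 over all fields of all records (assumed definition)."""
    tp = n_pred = n_gold = 0
    for pred, gold in zip(preds, golds):
        t, p, g = field_counts(pred, gold)
        tp, n_pred, n_gold = tp + t, n_pred + p, n_gold + g
    if tp == 0:
        return 0.0
    precision = tp / n_pred
    recall = tp / n_gold
    return 2 * precision * recall / (precision + recall)


# Hypothetical card: three of four fields match exactly, "place" does not.
gold = {"surname": "Martin", "given_name": "Jeanne", "death_date": "1892-03-14", "place": "Lyon"}
pred = {"surname": "Martin", "given_name": "Jeanne", "death_date": "1892-03-14", "place": "Lyons"}
score = exact_field_f1([pred], [gold])
```

Under this definition a single-character transcription slip makes the whole field wrong, which is one reason handwritten place names and long free-text fields weigh heavily on the score.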

Intended Use

This model is designed for libraries, archives, and museums to digitize card catalogues and index drawers into structured, ingestible records. It is best used as a first-pass extractor with human review rather than for generating production-ready ground truth without verification.
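
A first-pass workflow of this kind might check each model output against the requested template and attach flags for the human reviewer. The sketch below uses only the standard library; the template syntax (nested keys with placeholder type names) and the function names are hypothetical and not necessarily the format the model expects.

```python
import json
from typing import Any, Dict, List


def conforms(value: Any, template: Any) -> bool:
    """True if value has the same nested key structure as template."""
    if isinstance(template, dict):
        return (
            isinstance(value, dict)
            and set(value) == set(template)
            and all(conforms(value[k], template[k]) for k in template)
        )
    if isinstance(template, list):
        if not isinstance(value, list):
            return False
        # An empty template list accepts any list; otherwise check each item.
        return all(conforms(v, template[0]) for v in value) if template else True
    # Leaf: any scalar or null is accepted, containers are not.
    return not isinstance(value, (dict, list))


def empty_fields(value: Any, prefix: str = "") -> List[str]:
    """Return dotted paths of leaves that are null or empty strings."""
    if isinstance(value, dict):
        return [p for k, v in value.items() for p in empty_fields(v, f"{prefix}{k}.")]
    if isinstance(value, list):
        return [p for i, v in enumerate(value) for p in empty_fields(v, f"{prefix}{i}.")]
    return [prefix.rstrip(".")] if value in (None, "") else []


def review_flags(raw_output: str, template: Dict[str, Any]) -> List[str]:
    """Flags to show a reviewer; an empty list still means 'review', just unflagged."""
    try:
        record = json.loads(raw_output)
    except json.JSONDecodeError:
        return ["output is not valid JSON"]
    if not conforms(record, template):
        return ["keys do not match template"]
    return [f"empty field: {path}" for path in empty_fields(record)]


# Hypothetical template and model output for a death-record card.
template = {"surname": "string", "death": {"date": "string", "place": "string"}}
output = '{"surname": "Martin", "death": {"date": "1892-03-14", "place": ""}}'
flags = review_flags(output, template)
```

Flags of this sort only prioritize the reviewer's attention; a structurally valid record can still contain a misread name or date.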

Limitations

Training data for two of the three collections used machine-generated silver labels, which can cap the achievable quality on free-text fields.
Handwriting recognition remains challenging, particularly for place names and long free-text fields.
Test sets are small, so reported numbers should be considered directional.
Greedy / non-thinking decoding is recommended, as reasoning mode was not trained for this task.
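
As a rough illustration of the decoding recommendation, the settings below select greedy decoding when passed to a Hugging Face transformers `generate` call. Model and processor loading are omitted because they depend on the full model card; the helper name and the token limit are assumptions, and how reasoning mode is switched off depends on the model's chat template and is not shown.

```python
from typing import Any, Dict


def greedy_generation_kwargs(max_new_tokens: int = 512) -> Dict[str, Any]:
    """Keyword arguments for deterministic, greedy decoding.

    The default token limit is an assumed value, not one taken from the model card.
    """
    return {
        "do_sample": False,   # no sampling: always take the most likely next token
        "num_beams": 1,       # a single beam, i.e. plain greedy search
        "max_new_tokens": max_new_tokens,
    }


kwargs = greedy_generation_kwargs()
# Typical use, assuming `model` and `inputs` were prepared elsewhere:
# output_ids = model.generate(**inputs, **kwargs)
```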

Full Model Card (README)