Apokalyptikon/tei-entity-linker-qwen3-14b-mlx

TEXT GENERATION · Concurrency Cost: 1 · Model Size: 14B · Quant: FP8 · Ctx Length: 32k · Published: Apr 1, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights · Cold

Apokalyptikon/tei-entity-linker-qwen3-14b-mlx is a LoRA fine-tuned adapter for the 14-billion-parameter Qwen3-14B base model, designed for authority file linking in historical TEI editions. It disambiguates named entities (persons, places, organizations) from historical registers by matching them against Wikidata/GND candidates, handling historical spelling variants and complex biographical data. The model provides a verification step for digital scholarly editing workflows, emitting structured JSON verdicts on entity matches.


TEI Entity Linker: Qwen3-14B LoRA Adapter

This model is a LoRA adapter for the 14-billion-parameter Qwen3-14B base model, fine-tuned by Apokalyptikon to address a central challenge in digital scholarly editing: linking named entities from historical TEI registers to authority files such as Wikidata and GND. Given an entity (person, place, or organization) and a list of candidates, it returns a verdict (MATCH, PARTIAL, or NONE) with a confidence score and a brief reason, formatted as JSON.
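The output contract can be sketched as follows. The field names (`verdict`, `confidence`, `reason`) follow the description above, but the sample completion and the validation logic are illustrative, not taken from the model card:

```python
import json

# Illustrative completion in the shape the card describes:
# a verdict, a confidence score, and a brief reason.
raw_completion = (
    '{"verdict": "MATCH", "confidence": 0.93,'
    ' "reason": "Biographical data matches the candidate."}'
)

def parse_verdict(completion: str) -> dict:
    """Parse and sanity-check the model's JSON verdict."""
    result = json.loads(completion)
    if result["verdict"] not in {"MATCH", "PARTIAL", "NONE"}:
        raise ValueError(f"unexpected verdict: {result['verdict']}")
    if not 0.0 <= result["confidence"] <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    return result

verdict = parse_verdict(raw_completion)
```

Validating the shape of each completion before acting on it keeps a malformed generation from silently corrupting a register.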

Key Capabilities

  • Historical Entity Disambiguation: Accurately links entities by analyzing biographical data, descriptions, and context.
  • Handles Historical Spelling: Recognizes and matches historical variants (e.g., Creveld → Krefeld, Coeln → Köln).
  • Complex Entity Recognition: Distinguishes mythological figures from literary works, identifies ethnic groups, allegories, and personifications.
  • Structured Output: Provides deterministic JSON output with verdicts, confidence scores, and reasons, suitable for automated pipelines.
  • Optimized for TEI Data: Trained on real-world historical TEI editions (early modern German correspondence, classical philology registers).

Training and Performance

The model was fine-tuned using LoRA on Apple Silicon via MLX, leveraging a teacher-student approach where Claude Sonnet 4 generated high-quality labeled training data. It was trained on 7,098 examples, focusing on single-entity prompts for optimal performance. The model demonstrates strong capabilities in:

  • Biographical disambiguation: Correctly distinguishing individuals with the same name based on life dates.
  • Geographic disambiguation: Differentiating locations with similar names (e.g., Dillingen an der Donau vs. Dillingen/Saar).
  • Robust Rejection: Reliably returns NONE for underspecified or non-matching entries.
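In an automated pipeline, these verdicts can be triaged before anything is written back to the edition. A minimal sketch, assuming an acceptance threshold of 0.8 (the threshold and the sample records are illustrative, not from the model card):

```python
# Illustrative verdicts for three register entries; names and
# scores are made up for the example.
verdicts = [
    {"entity": "Coeln", "verdict": "MATCH", "confidence": 0.95},
    {"entity": "Dillingen", "verdict": "PARTIAL", "confidence": 0.60},
    {"entity": "N. N.", "verdict": "NONE", "confidence": 0.90},
]

def triage(items, accept_at=0.8):
    """Split verdicts into auto-accepted, needs-review, and rejected."""
    accepted = [v for v in items
                if v["verdict"] == "MATCH" and v["confidence"] >= accept_at]
    rejected = [v for v in items if v["verdict"] == "NONE"]
    review = [v for v in items if v not in accepted and v not in rejected]
    return accepted, review, rejected

accepted, review, rejected = triage(verdicts)
```

Routing PARTIAL and low-confidence MATCH results to human review preserves the "verification step" role the card describes: the model filters, the editor decides.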

Good For

  • Digital Humanities Projects: Specifically designed for researchers and developers working with historical TEI editions.
  • Automated Entity Linking: Integrating into pipelines for verifying and enriching named entities in historical texts.
  • Authority File Integration: Bridging historical documents with modern authority databases like Wikidata and GND.
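Once a MATCH is accepted, the authority link is typically written into the TEI register as a `ref` attribute. A minimal sketch using the standard library; the register entry and the Wikidata URI are illustrative:

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
ET.register_namespace("", TEI_NS)

# Illustrative register entry: the historical spelling "Coeln"
# that the model would match to a modern authority record.
entry = ET.fromstring(
    f'<place xmlns="{TEI_NS}"><placeName>Coeln</placeName></place>'
)
name = entry.find(f"{{{TEI_NS}}}placeName")
# Write the verified authority link as a TEI ref attribute.
name.set("ref", "https://www.wikidata.org/entity/Q365")
linked = ET.tostring(entry, encoding="unicode")
```

For production TEI work, lxml with full TEI-aware tooling would be more robust, but the write-back step itself is this simple.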

This adapter requires an Apple Silicon Mac for MLX inference; the 4-bit quantized model needs approximately 10 GB of RAM.
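A typical invocation with the mlx-lm tooling might look like the following; the adapter path and prompt are placeholders, and flag names should be checked against the installed mlx-lm version:

```shell
# Install MLX LM tooling (Apple Silicon only).
pip install mlx-lm

# Generate with the base model plus the LoRA adapter; the adapter
# path and the prompt are illustrative placeholders.
mlx_lm.generate \
  --model Qwen/Qwen3-14B \
  --adapter-path ./tei-entity-linker-qwen3-14b-mlx \
  --prompt "Entity: Coeln (place). Candidates: ... Return a JSON verdict."
```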