Name: Pclanglais/POntAvignon-4b API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: Pclanglais

POntAvignon-4b: Specialized French Theater Programme Annotation Model

Pclanglais/POntAvignon-4b is a 4-billion-parameter model built on Qwen/Qwen3-4B and fine-tuned using Pleias' Baguettotron SYNTH-syntax. Its core function is to annotate French theater programmes from the Festival d'Avignon (1947–present), transforming raw markdown into structured Linked Art JSON-LD entities. The model supports a context length of 16k tokens and, on a held-out test set, produces valid JSON for 97% of outputs with a token accuracy of 96.6%.
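The valid JSON rate quoted above can be read as the share of model outputs that parse as JSON. The card does not say how the figure was computed, so the following is only a minimal sketch of one plausible definition; the function name valid_json_rate is hypothetical.

```python
import json


def valid_json_rate(outputs):
    """Return the fraction of strings in `outputs` that parse as JSON.

    Assumption: "valid" means json.loads succeeds. The actual evaluation
    behind the published figure may be stricter (for example, schema checks).
    """
    if not outputs:
        return 0.0
    valid = 0
    for text in outputs:
        try:
            json.loads(text)
            valid += 1
        except json.JSONDecodeError:
            # Malformed or truncated output counts as invalid.
            pass
    return valid / len(outputs)
```

Note that json.loads also accepts bare scalars such as 3 or "text"; a stricter check could additionally require the parsed value to be a dict or list.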

Key Capabilities

Structured Data Extraction: Extracts 7 distinct Linked Art entity types (e.g., PropositionalObject for abstract works, Activity for productions/performances, LinguisticObject for source texts).
Chain-of-Thought Reasoning: Emits a dense reasoning trace inside <think> tags before the JSON-LD output. The trace explicitly names the task, engages with the document structure, and resolves ontological boundaries.
French Theatrical Expertise: Handles French theatrical vocabulary, BnF role mapping, and historical typographic conventions.
Ontology Alignment: Targets the Linked Art Performing Arts extension (v0.9), incorporating BnF role vocabulary, deterministic content-derived IDs, and source attribution for every extracted fact.
Robust Training: Trained on 12,507 samples derived from ~1,400 Festival d'Avignon programmes (1971–2022), using a mix of Claude Sonnet and Gemma 12B backreasoning for trace generation.
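Because the reasoning trace precedes the JSON-LD, a caller usually needs to separate the two before parsing. The sketch below assumes the raw output contains a single <think>...</think> block followed by one JSON document; that layout is an assumption based on the description above, and split_response is a hypothetical helper name.

```python
import json
import re

# Assumption: one <think>...</think> block, then the JSON-LD payload.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_response(raw):
    """Split raw model output into (reasoning_trace, parsed_json_or_None)."""
    match = THINK_RE.search(raw)
    if match:
        trace = match.group(1).strip()
        payload = raw[match.end():].strip()
    else:
        # No trace found: treat the whole output as the payload.
        trace = ""
        payload = raw.strip()
    try:
        data = json.loads(payload)
    except json.JSONDecodeError:
        # Invalid or truncated JSON is reported as None rather than raised.
        data = None
    return trace, data
```

Returning None for an unparseable payload lets a batch job log the failure and keep the trace for inspection instead of stopping.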

Good for

Digital Humanities Research: Ideal for researchers working with historical French theater archives, particularly those from the Festival d'Avignon.
Knowledge Graph Construction: Facilitates the creation of structured knowledge graphs for performing arts by converting unstructured programme data into Linked Art JSON-LD.
Specialized NLP Tasks: Demonstrates strong performance in highly specialized information extraction tasks requiring deep domain understanding and complex reasoning.
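For knowledge graph construction, a common first step is to group the extracted entities by type. The sketch below walks an already-parsed JSON-LD document and collects every object that carries a string "type" key. The exact shape of the model's output is not documented here, so the traversal makes no assumption about nesting; index_by_type is a hypothetical helper name.

```python
from collections import defaultdict


def index_by_type(node, index=None):
    """Recursively collect dicts that have a string "type" key, grouped by type."""
    if index is None:
        index = defaultdict(list)
    if isinstance(node, dict):
        entity_type = node.get("type")
        if isinstance(entity_type, str):
            index[entity_type].append(node)
        # Descend into nested values, which may hold further entities.
        for value in node.values():
            index_by_type(value, index)
    elif isinstance(node, list):
        for item in node:
            index_by_type(item, index)
    return index
```

The resulting mapping, for example index["Activity"], can then be loaded into a graph store or deduplicated by the "id" field.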

Limitations

Specialized Scope: Primarily trained on Festival d'Avignon programmes; performance on other festivals or non-French theatrical traditions may vary.
Language Dependency: French-centric in its understanding of text, roles, and conventions.
Contextual Truncation: Long cast/crew lists may be truncated when a programme approaches the 16k-token context limit.
Date Inference: Infers the year from the filename when the programme does not state it explicitly.
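Since year inference depends on the filename, it can be worth checking before submission that each filename actually carries a plausible year. The sketch below is a pre-processing check only; how the model itself reads the filename is not described, and both the infer_year name and the filename patterns are hypothetical.

```python
import re

# A standalone four-digit year from 1940 to 2099, not embedded in a longer number.
YEAR_RE = re.compile(r"(?<!\d)(19[4-9]\d|20\d{2})(?!\d)")


def infer_year(filename, first_year=1947):
    """Return the first plausible festival year in `filename`, or None.

    first_year defaults to 1947, the first festival year given in the card.
    """
    for token in YEAR_RE.findall(filename):
        year = int(token)
        if year >= first_year:
            return year
    return None
```

Files for which this returns None are candidates for renaming, or for adding the year to the programme text, before annotation.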
