Zual/MPropositioneur-V1: Atomic Proposition Extraction
MPropositioneur-V1, developed by Luc Pommeret at LISN (CNRS), is a compact language model built upon the Qwen3-0.6B architecture. Its core function is to atomize complex sentences or passages, breaking them down into a list of simple, independent, and semantically faithful atomic propositions.
Key Capabilities & Features
- Atomic Proposition Extraction: Specializes in deconstructing text into its most basic declarative units.
- Distillation Training: Trained by distillation specifically for the atomization task.
- Multilingual Support: Trained to handle multiple languages, including French and English.
- JSON Output: Generates propositions as a JSON list of strings, ["p1", "p2", ...], for easy programmatic use.
- Specific Prompt Format: Utilizes a clear prompt structure: <|im_start|>user\nAtomize: {text}<|im_end|>\n<|im_start|>assistant\n.
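The prompt format and JSON output described above can be sketched in Python as follows. This is a minimal sketch, not the author's reference code: the helper names (build_prompt, parse_propositions, atomize) are hypothetical, the generation settings (greedy decoding, max_new_tokens=256) are assumptions, and the model identifier is taken from the title of this card. The atomize function assumes the Hugging Face transformers library and PyTorch are installed; the other two helpers use only the standard library.

```python
import json

# Prompt structure as stated in the model card ("\n" is a real newline).
PROMPT_TEMPLATE = "<|im_start|>user\nAtomize: {text}<|im_end|>\n<|im_start|>assistant\n"


def build_prompt(text: str) -> str:
    """Wrap the input text in the documented prompt structure."""
    return PROMPT_TEMPLATE.format(text=text)


def parse_propositions(raw: str) -> list[str]:
    """Extract the JSON list of strings from the raw model output.

    Assumption: the list is the outermost [...] span in the output.
    Raises ValueError if no valid list of strings is found.
    """
    start, end = raw.find("["), raw.rfind("]")
    if start == -1 or end < start:
        raise ValueError("no JSON list found in model output")
    items = json.loads(raw[start:end + 1])  # JSONDecodeError is a ValueError
    if not isinstance(items, list) or not all(isinstance(p, str) for p in items):
        raise ValueError("expected a JSON list of strings")
    return [p.strip() for p in items if p.strip()]


def atomize(text: str, model_id: str = "Zual/MPropositioneur-V1",
            max_new_tokens: int = 256) -> list[str]:
    """Run the model on one passage (assumed settings, not official ones)."""
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tokenizer(build_prompt(text), return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Keep only the newly generated tokens, dropping the echoed prompt.
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return parse_propositions(tokenizer.decode(new_tokens, skip_special_tokens=True))
```

Validating the output before use is a deliberate choice here: a small generative model can occasionally emit malformed JSON, so callers may want to catch ValueError and retry or skip the passage.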
Ideal Use Cases
This model is particularly well-suited for applications requiring fine-grained information processing:
- Retrieval-Augmented Generation (RAG): Improves retrieval quality by indexing atomic propositions instead of larger text chunks.
- Open Information Extraction (OpenIE): Facilitates more precise extraction of facts and relationships.
- Text Simplification and Discourse Analysis: Aids in breaking down complex information for easier understanding and analysis.
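To illustrate the RAG use case, here is a minimal sketch of proposition-level indexing: each chunk is atomized, every proposition is stored with a pointer back to its source chunk, and retrieval scores propositions rather than whole chunks. All names (PropositionRecord, build_index, search, stub_atomizer) are hypothetical. The word-overlap scorer is a placeholder for a real embedding model, and stub_atomizer is a naive sentence splitter standing in for a call to MPropositioneur-V1.

```python
import re
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class PropositionRecord:
    proposition: str
    chunk_id: int  # index of the source chunk, so the full context can be recovered


def build_index(chunks: Sequence[str],
                atomizer: Callable[[str], list[str]]) -> list[PropositionRecord]:
    """Atomize every chunk and record which chunk each proposition came from."""
    return [PropositionRecord(prop, chunk_id)
            for chunk_id, chunk in enumerate(chunks)
            for prop in atomizer(chunk)]


def _tokens(text: str) -> set[str]:
    """Lowercased word set; a placeholder for a real embedding."""
    return set(re.findall(r"\w+", text.lower()))


def search(index: Sequence[PropositionRecord], query: str,
           top_k: int = 3) -> list[PropositionRecord]:
    """Rank propositions by word overlap with the query (toy scorer)."""
    q = _tokens(query)
    scored = [(len(q & _tokens(r.proposition)), r) for r in index]
    scored = [s for s in scored if s[0] > 0]
    scored.sort(key=lambda s: s[0], reverse=True)
    return [r for _, r in scored[:top_k]]


def stub_atomizer(chunk: str) -> list[str]:
    """Hypothetical stand-in for the model: splits on sentence boundaries."""
    return [s.strip().rstrip(".") + "." for s in chunk.split(". ") if s.strip()]


chunks = [
    "Paris is the capital of France. Paris hosts the Louvre.",
    "Berlin is the capital of Germany.",
]
index = build_index(chunks, stub_atomizer)
hits = search(index, "capital of Germany", top_k=1)
print(hits[0].proposition, "-> chunk", hits[0].chunk_id)
```

Keeping chunk_id on each record lets a retriever match on the fine-grained proposition while still passing the full original chunk to the generator.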