Guillaume Tell: A RAG-Optimized French LLM
Guillaume Tell is a 7 billion parameter French Large Language Model developed by Etalab (Service du Datalab) and built upon the Mistral Open-Hermes 2.5 architecture. Its core innovation lies in its optimization for Retrieval Augmented Generation (RAG), emphasizing source traceability and explicability for responses derived from French administrative documents.
Key Capabilities
- Sourced Answer Generation: Generates responses to questions by retrieving information from a provided set of sources (currently tested with five fixed sources).
- Verifiability: Designed to improve the verifiability of generated text, crucial for administrative contexts.
- French Administrative Focus: Specifically tailored to answer questions related to French administrative procedures and information.
- ChatML Prompting: Utilizes a specific ChatML-based prompt format for structured input, including source integration.
- Fine-tuned for RAG: Fine-tuned with 3880 synthetic RAG instructions and 5000 chatRAG instructions based on service-public.fr data, using LORA and 4-bit quantization.
Good for
- Public Agents: Intended for use by public officials in French administrations to facilitate administrative information retrieval.
- Verifiable Information Systems: Ideal for applications requiring responses with clear source attribution and explicability, such as the ALBERT interministerial Generative AI tool.
- French-language Administrative Q&A: Excels at providing first-level answers to administrative questions exclusively in French.