legml-v1.0-instruct: French Instruction-Tuned LLM
legmlai/legml-v1.0-instruct is an 8 billion parameter instruction-tuned model, derived from legml-v1.0-base (Qwen-3). Developed by Mohamad Alhajar and legml.ai, this model is uniquely focused on the French language, having been fine-tuned on the Open-Hermes-FR dataset.
Key Capabilities & Features
- 100% Francophone Alignment: Fine-tuned on 799,875 instruction/response pairs exclusively in French, ensuring high-quality French language understanding and generation.
- Dataset Origin: Open-Hermes-FR was created by translating the original OpenHermes dataset into French using GPT-4o, followed by response generation and automatic filtering.
- Training Methodology: Utilizes Supervised Fine-Tuning (SFT) with multi-turn conversations and a light application of Direct Preference Optimization (DPO).
- Purpose: Designed to provide a coherent and rich foundation for aligning French-speaking Large Language Models, excelling in dialogue, reasoning, and question-answering within a French context.
Limitations
- Knowledge Cut-off: Limited knowledge beyond April 2025.
- Mathematical Reasoning: Competitive mathematical reasoning is still under development.
- Bias: Potential biases inherited from source datasets and GPT-4o may persist.
This model is particularly well-suited for applications requiring precise and contextually appropriate French language interactions.