kurakurai/Luth-LFM2-350M
kurakurai/Luth-LFM2-350M is a 350 million parameter language model, fine-tuned by kurakurai in collaboration with Liquid AI, specifically to enhance its French language capabilities. Based on the LFM2-350M architecture, this model demonstrates improved performance in French instruction following, mathematical reasoning, and general knowledge, while maintaining its English proficiency. It is primarily designed for applications requiring strong French language understanding and generation in a compact model size.
Loading preview...
Luth-LFM2-350M: Enhanced French Capabilities in a Compact Model
Luth-LFM2-350M is a 350 million parameter model developed by kurakurai in collaboration with Liquid AI. It is a French fine-tuned version of the LFM2-350M base model, trained on the specialized Luth-SFT dataset. This fine-tuning process, utilizing full fine-tuning with Axolotl, successfully improved the model's French language performance across several key areas while preserving its English capabilities.
Key Capabilities and Performance
This model excels in French instruction following, mathematical tasks, and general knowledge, as evidenced by its benchmark results. Evaluation using LightEval with custom French tasks shows significant improvements over the base LFM2-350M and SmolLM2-360M-Instruct:
- French Benchmarks: Luth-LFM2-350M achieved leading scores in IFEval French (38.26), MMLU French (39.15), Math500 French (23.00), Arc-Challenge French (34.13), and Hellaswag French (43.39).
- English Benchmarks: The model maintained strong English performance, even showing slight improvements in IFEval English (57.05), GPQA-Diamond English (28.28), and Math500 English (23.20) compared to its base model.
Use Cases and Differentiators
Luth-LFM2-350M is particularly well-suited for applications requiring a small, efficient language model with strong French language understanding and generation. Its ability to handle both French and English tasks effectively makes it versatile for multilingual environments where French is a primary focus. The model's development process and evaluation scripts are openly available on GitHub, providing transparency and reproducibility.