SemanticAlignment/Mistral-v0.1-Italian-LAPT-instruct
Mistral-v0.1-Italian-LAPT-instruct is a 7-billion-parameter instruction-tuned causal language model developed by SapienzaNLP, ISTI-CNR, and ILC-CNR. Based on the Mistral-7B-v0.1 architecture, it has been continually pre-trained and instruction-tuned primarily on Italian and English data, with a focus on Italian. The model targets Italian language understanding and generation, and improves on its base model across Italian benchmarks.
Mistral-v0.1-Italian-LAPT-instruct Overview
This model is part of the Mistral-7B-v0.1-Adapted collection, a series of 7B generative models derived from Mistral-7B-Base-v0.1. Developed by SapienzaNLP, ISTI-CNR, and ILC-CNR, this specific variant has undergone continual pre-training and instruction tuning to enhance its capabilities, particularly for the Italian language.
Key Adaptations and Training
The model's adaptation involved training on a custom dataset skewed towards Italian, comprising 9 billion tokens from the Italian part of CulturaX and 3 billion English tokens from the same source. For instruction tuning, a diverse mix of datasets was used, including TÜLU-v3, LIMA, WildChat-IT, TowerBlocks-v0.2, GPT-4o-ITA-Instruct, and Aya, with a significant portion being Italian-centric.
Performance and Use Cases
Evaluated on ITA-Bench, the LAPT-adapted model shows competitive performance on Italian language tasks. For instance, it achieves 52.9 on MMLU (5-shot) and 58.4 on HellaSwag (0-shot), outperforming the original Mistral-7B-v0.1 on these metrics. The model is well-suited for applications requiring robust Italian language understanding and generation, such as chatbots, content creation, and translation assistance in Italian contexts.
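For such applications, prompts need to follow the model's instruction format. Since this model derives from Mistral-7B-v0.1, a reasonable assumption is that it uses the Mistral `[INST]` chat format; the sketch below illustrates that format with a hypothetical helper, but in practice the tokenizer's own chat template (`tokenizer.apply_chat_template`) is authoritative and may differ.

```python
def format_mistral_prompt(messages):
    """Format a list of chat turns in the Mistral [INST] style.

    Hypothetical helper for illustration only; this assumes the model
    inherits Mistral-7B-v0.1's instruction format. Check the repository's
    tokenizer chat template before relying on this layout.
    """
    text = "<s>"
    for msg in messages:
        if msg["role"] == "user":
            # User turns are wrapped in [INST] ... [/INST] markers.
            text += f"[INST] {msg['content']} [/INST]"
        elif msg["role"] == "assistant":
            # Assistant turns follow the instruction and close with </s>.
            text += f" {msg['content']}</s>"
    return text


# Example: a single-turn Italian instruction.
prompt = format_mistral_prompt(
    [{"role": "user", "content": "Riassumi questo testo in italiano."}]
)
print(prompt)  # → <s>[INST] Riassumi questo testo in italiano. [/INST]
```

The resulting string is what would be tokenized and fed to the model for generation; multi-turn conversations simply alternate user and assistant entries in the list.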