SemanticAlignment/Llama-3.1-8B-Italian-SAVA-instruct
Llama-3.1-8B-Italian-SAVA-instruct is an 8-billion-parameter instruction-tuned model from the Llama-3.1-8B-Adapted collection, developed by SapienzaNLP, ISTI-CNR, and ILC-CNR. Its vocabulary is inherited from Minerva-3B, and it was continually trained on an Italian-skewed mix of Italian and English data from CulturaX. The model is optimized for Italian language understanding and generation, with strong results on Italian benchmarks such as ITA-Bench.
Model Overview
Llama-3.1-8B-Italian-SAVA-instruct is an 8-billion-parameter instruction-tuned model from the Llama-3.1-8B-Adapted collection, developed by SapienzaNLP, ISTI-CNR, and ILC-CNR. Its vocabulary is inherited from Minerva-3B, strengthening its handling of Italian text. The model builds on an optimized transformer architecture and is designed for generative tasks.
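As an instruction-tuned Llama-3.1 derivative, the model presumably expects the standard Llama 3.1 chat format. The card does not state the template explicitly, so the sketch below assumes the upstream Llama 3.1 special tokens; in practice, `tokenizer.apply_chat_template()` should be preferred, since it reads the template shipped with the model.

```python
def format_llama31_chat(system: str, user: str) -> str:
    """Assemble a Llama 3.1-style chat prompt (assumed template).

    Hand-rolled here only for illustration; prefer
    tokenizer.apply_chat_template() in real code.
    """
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        # Trailing assistant header cues the model to generate its reply.
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = format_llama31_chat(
    "Sei un assistente utile che risponde in italiano.",
    "Riassumi in una frase cos'è il machine learning.",
)
```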
Training and Adaptation
The model underwent continual training on a custom dataset derived from CulturaX, with the mix skewed toward Italian: 9B Italian tokens to 3B English tokens, a 3:1 ratio. Instruction tuning (SFT) was then performed for two epochs on a diverse set of Italian and multilingual datasets, including TÜLU-v3, LIMA, WildChat-IT, TowerBlocks-v0.2, GPT-4o-ITA-Instruct, and Aya.
Performance and Use Cases
Evaluated on ITA-Bench, the model achieved competitive scores, including 56.9 on MMLU (5-shot) and 62.3 on IFEval (inst_level). It is well suited to applications requiring robust Italian language generation and understanding, such as chatbots, content creation, and instruction following in Italian. Its specialized training makes it a strong candidate for Italian-centric NLP tasks.
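A minimal inference sketch with Hugging Face Transformers follows. The repository id is assumed from this card's title, and the generation settings are illustrative rather than recommended defaults:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repository id, taken from the title of this card.
MODEL_ID = "SemanticAlignment/Llama-3.1-8B-Italian-SAVA-instruct"

def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Generate an Italian completion for a single user turn."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # device_map="auto" requires the `accelerate` package; bf16 weights
    # for an 8B model need roughly 16 GB of GPU memory.
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    messages = [{"role": "user", "content": prompt}]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(input_ids, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)

# Example call (downloads ~16 GB of weights on first use):
# print(generate("Spiega brevemente cos'è l'apprendimento automatico."))
```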