SemanticAlignment/Llama-3.1-8B-Italian-SAVA-instruct
Llama-3.1-8B-Italian-SAVA-instruct is an 8-billion-parameter instruction-tuned model from the Llama-3.1-8B-Adapted collection, developed by SapienzaNLP, ISTI-CNR, and ILC-CNR. Its vocabulary is inherited from Minerva-3B, and it was continually trained on an Italian-skewed mix of Italian and English data from CulturaX. The model is optimized for Italian language understanding and generation, with strong results on Italian benchmarks such as ITA-Bench.
Model Overview
Llama-3.1-8B-Italian-SAVA-instruct is an 8-billion-parameter instruction-tuned model from the Llama-3.1-8B-Adapted collection, developed by SapienzaNLP, ISTI-CNR, and ILC-CNR. Its vocabulary is inherited from Minerva-3B, strengthening its handling of Italian text. The model builds on an optimized transformer architecture and is designed for generative tasks.
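As an instruction-tuned Llama-3.1 derivative, the model presumably expects the standard Llama 3.1 chat format. The card does not state the template explicitly, so the sketch below assumes the upstream Llama 3.1 special tokens; in practice, `tokenizer.apply_chat_template()` should be preferred, since it reads the template shipped with the model.

```python
def format_llama31_chat(system: str, user: str) -> str:
    """Assemble a Llama 3.1-style chat prompt (assumed template).

    Hand-rolled here only for illustration; prefer
    tokenizer.apply_chat_template() in real code.
    """
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        # Trailing assistant header cues the model to generate its reply.
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = format_llama31_chat(
    "Sei un assistente utile che risponde in italiano.",
    "Riassumi in una frase cos'è il machine learning.",
)
```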
Training and Adaptation
The model underwent continual training on a custom dataset derived from CulturaX, with the mix skewed toward Italian: 9B Italian tokens to 3B English tokens, a 3:1 ratio. Instruction tuning (SFT) was then performed for two epochs on a diverse set of Italian and multilingual datasets, including TÜLU-v3, LIMA, WildChat-IT, TowerBlocks-v0.2, GPT-4o-ITA-Instruct, and Aya.
Performance and Use Cases
Evaluated on ITA-Bench, the model achieved competitive scores, including 56.9 on MMLU (5-shot) and 62.3 on IFEval (inst_level). It is well suited to applications requiring robust Italian language generation and understanding, such as chatbots, content creation, and instruction following in Italian. Its specialized training makes it a strong candidate for Italian-centric NLP tasks.
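A minimal inference sketch with Hugging Face Transformers follows. The repository id is assumed from this card's title, and the generation settings are illustrative rather than recommended defaults:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repository id, taken from the title of this card.
MODEL_ID = "SemanticAlignment/Llama-3.1-8B-Italian-SAVA-instruct"

def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Generate an Italian completion for a single user turn."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # device_map="auto" requires the `accelerate` package; bf16 weights
    # for an 8B model need roughly 16 GB of GPU memory.
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    messages = [{"role": "user", "content": prompt}]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(input_ids, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)

# Example call (downloads ~16 GB of weights on first use):
# print(generate("Spiega brevemente cos'è l'apprendimento automatico."))
```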