swap-uniba/LLaMAntino-2-70b-hf-UltraChat-ITA
LLaMAntino-2-70b-hf-UltraChat-ITA is a 69-billion-parameter instruction-tuned LLaMA 2 model developed by Pierpaolo Basile, Elio Musacchio, Marco Polignano, Lucia Siciliani, Giuseppe Fiameni, and Giovanni Semeraro, specifically adapted for the Italian language. Fine-tuned with QLoRA on an Italian translation of the UltraChat dataset, the model is designed to improve performance on Italian dialogue use cases. It supports a 32K context length and aims to provide an improved resource for Italian NLP research.
Overview
LLaMAntino-2-70b-hf-UltraChat-ITA is an instruction-tuned Large Language Model (LLM) based on LLaMA 2 70B and specifically adapted for the Italian language. Developed by a team including Pierpaolo Basile and Marco Polignano, and supported by the PNRR project FAIR (Future Artificial Intelligence Research), the model aims to significantly improve performance in Italian dialogue applications.
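As a quick orientation, the snippet below sketches one way to load the model with the Hugging Face `transformers` library. The 4-bit quantization settings are an assumption to make the 70B checkpoint fit on limited GPU memory, not a requirement stated by the authors.

```python
# Minimal loading sketch, assuming the standard Hugging Face transformers API.
# The 4-bit bitsandbytes quantization is an assumption to fit the 70B model
# on limited GPU memory; remove it if your hardware can hold bf16 weights.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "swap-uniba/LLaMAntino-2-70b-hf-UltraChat-ITA"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    ),
    device_map="auto",  # spread layers across available GPUs
)
```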
Key Capabilities & Training
- Italian Language Focus: An Italian-adapted version of LLaMA 2 70B, making it well suited to Italian NLP tasks.
- Instruction-Tuned: Instruction-tuned with QLoRA on the UltraChat dataset, translated into Italian using Argos Translate.
- Prompt Format: Uses a LLaMA 2-based prompt template adapted for Italian; following it at inference time is recommended for best results (see the sketch after this list).
- Compute Infrastructure: Training was carried out on the Leonardo supercomputer.
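For illustration, here is a minimal sketch of a LLaMA 2-style instruction prompt. The Italian system prompt below is a placeholder assumption; the exact template is defined in the model card and should be preferred.

```python
# Sketch of a LLaMA 2-style instruction prompt. The Italian system prompt is
# a placeholder assumption; use the exact template from the model card.
def build_prompt(user_message: str, system_prompt: str) -> str:
    # The BOS token (<s>) is added by the tokenizer, so it is omitted here.
    return (
        f"[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n"
        f"{user_message} [/INST]"
    )

prompt = build_prompt(
    user_message="Spiega brevemente cos'è il machine learning.",
    system_prompt=(
        "Sei un assistente disponibile, rispettoso e onesto. "
        "Rispondi sempre in italiano."
    ),
)
```

The resulting string can then be tokenized and passed to `model.generate`, for example with the model and tokenizer from the loading sketch in the Overview.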
Performance
Evaluation metrics show competitive performance on Italian benchmarks:
| Benchmark    | Metric     | Score  |
|--------------|------------|--------|
| hellaswag_it | acc_norm   | 0.6566 |
| arc_it       | acc_norm   | 0.5004 |
| m_mmlu_it    | 5-shot acc | 0.6084 |
| Average      |            | 0.588  |
For a detailed comparison, users can refer to the Leaderboard for Italian Language Models.
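The task names above follow the lm-evaluation-harness conventions used by such leaderboards. Assuming a harness build that ships these Italian tasks, the stated m_mmlu_it 5-shot score could be checked along the lines of the sketch below; the dtype choice and the shot settings for the other two tasks are assumptions, since only m_mmlu_it's shot count is stated here.

```python
# Hedged reproduction sketch. Assumes an lm-evaluation-harness build that
# includes the Italian tasks listed above; check your harness version (or
# the leaderboard's fork) for the exact task names and shot settings.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args=(
        "pretrained=swap-uniba/LLaMAntino-2-70b-hf-UltraChat-ITA,"
        "dtype=bfloat16"  # dtype choice is an assumption
    ),
    tasks=["m_mmlu_it"],
    num_fewshot=5,  # the only shot count stated in this section
)
print(results["results"]["m_mmlu_it"])
```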
When to Use This Model
- Italian Dialogue Systems: Ideal for applications requiring high-quality conversational AI in Italian.
- Italian NLP Research: A valuable resource for researchers working on language models and natural language processing specific to the Italian language.
- Fine-tuning for Italian Tasks: Provides a strong base for further fine-tuning on specialized Italian datasets; a hedged QLoRA sketch follows below.
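Since the model card names QLoRA as the tuning method, a further fine-tuning run could follow the same recipe. The sketch below uses `peft` and `trl`; every hyperparameter, the dataset file `my_italian_dataset.jsonl`, and the output directory are illustrative assumptions, not the authors' actual training configuration.

```python
# Hedged QLoRA fine-tuning sketch mirroring the approach named in the model
# card (QLoRA). All hyperparameters and the dataset are illustrative
# assumptions; assumes a recent trl version with SFTConfig.
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from trl import SFTConfig, SFTTrainer

model_id = "swap-uniba/LLaMAntino-2-70b-hf-UltraChat-ITA"

# Load the base model in 4-bit, as QLoRA requires a quantized backbone.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    ),
    device_map="auto",
)

peft_config = LoraConfig(  # illustrative LoRA hyperparameters
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

# "my_italian_dataset.jsonl" is a hypothetical placeholder for a specialized
# Italian instruction dataset with a "text" column.
dataset = load_dataset("json", data_files="my_italian_dataset.jsonl")["train"]

trainer = SFTTrainer(  # the tokenizer is loaded from the model id by trl
    model=model,
    train_dataset=dataset,
    peft_config=peft_config,
    args=SFTConfig(output_dir="llamantino-ft", max_seq_length=2048),
)
trainer.train()
```

Only the LoRA adapter weights are trained here, which is what makes tuning a 70B model feasible on modest hardware.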