swap-uniba/LLaMAntino-2-70b-hf-UltraChat-ITA
LLaMAntino-2-70b-hf-UltraChat-ITA is a 69-billion-parameter instruction-tuned LLaMA 2 model developed by Pierpaolo Basile, Elio Musacchio, Marco Polignano, Lucia Siciliani, Giuseppe Fiameni, and Giovanni Semeraro, specifically adapted for the Italian language. Fine-tuned with QLoRA on an Italian translation of the UltraChat dataset, the model is designed to improve performance on Italian dialogue use cases. It supports a 32K context length and aims to provide an improved resource for Italian NLP research.
Overview
LLaMAntino-2-70b-hf-UltraChat-ITA is an instruction-tuned Large Language Model (LLM) based on LLaMA 2 70B and specifically adapted for the Italian language. Developed by a team including Pierpaolo Basile and Marco Polignano, and supported by the PNRR project FAIR (Future Artificial Intelligence Research), the model aims to significantly improve performance in Italian dialogue applications.
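As a quick orientation, the snippet below sketches one way to load the model with the Hugging Face `transformers` library. The 4-bit quantization settings are an assumption to make the 70B checkpoint fit on limited GPU memory, not a requirement stated by the authors.

```python
# Minimal loading sketch, assuming the standard Hugging Face transformers API.
# The 4-bit bitsandbytes quantization is an assumption to fit the 70B model
# on limited GPU memory; remove it if your hardware can hold bf16 weights.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "swap-uniba/LLaMAntino-2-70b-hf-UltraChat-ITA"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    ),
    device_map="auto",  # spread layers across available GPUs
)
```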
Key Capabilities & Training
- Italian Language Focus: An Italian-adapted version of LLaMA 2 70B, making it well suited to Italian NLP tasks.
- Instruction-Tuned: Instruction-tuned with QLoRA on the UltraChat dataset, translated into Italian using Argos Translate.
- Prompt Format: Uses a LLaMA 2-based prompt template adapted for Italian; following it at inference time is recommended for best results (see the sketch after this list).
- Compute Infrastructure: Training was carried out on the Leonardo supercomputer.
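For illustration, here is a minimal sketch of a LLaMA 2-style instruction prompt. The Italian system prompt below is a placeholder assumption; the exact template is defined in the model card and should be preferred.

```python
# Sketch of a LLaMA 2-style instruction prompt. The Italian system prompt is
# a placeholder assumption; use the exact template from the model card.
def build_prompt(user_message: str, system_prompt: str) -> str:
    # The BOS token (<s>) is added by the tokenizer, so it is omitted here.
    return (
        f"[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n"
        f"{user_message} [/INST]"
    )

prompt = build_prompt(
    user_message="Spiega brevemente cos'è il machine learning.",
    system_prompt=(
        "Sei un assistente disponibile, rispettoso e onesto. "
        "Rispondi sempre in italiano."
    ),
)
```

The resulting string can then be tokenized and passed to `model.generate`, for example with the model and tokenizer from the loading sketch in the Overview.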
Performance
Evaluation metrics show competitive performance on Italian benchmarks:
| Benchmark    | Metric     | Score  |
|--------------|------------|--------|
| hellaswag_it | acc_norm   | 0.6566 |
| arc_it       | acc_norm   | 0.5004 |
| m_mmlu_it    | 5-shot acc | 0.6084 |
| Average      |            | 0.588  |
For a detailed comparison, users can refer to the Leaderboard for Italian Language Models.
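The task names above follow the lm-evaluation-harness conventions used by such leaderboards. Assuming a harness build that ships these Italian tasks, the stated m_mmlu_it 5-shot score could be checked along the lines of the sketch below; the dtype choice and the shot settings for the other two tasks are assumptions, since only m_mmlu_it's shot count is stated here.

```python
# Hedged reproduction sketch. Assumes an lm-evaluation-harness build that
# includes the Italian tasks listed above; check your harness version (or
# the leaderboard's fork) for the exact task names and shot settings.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args=(
        "pretrained=swap-uniba/LLaMAntino-2-70b-hf-UltraChat-ITA,"
        "dtype=bfloat16"  # dtype choice is an assumption
    ),
    tasks=["m_mmlu_it"],
    num_fewshot=5,  # the only shot count stated in this section
)
print(results["results"]["m_mmlu_it"])
```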
When to Use This Model
- Italian Dialogue Systems: Ideal for applications requiring high-quality conversational AI in Italian.
- Italian NLP Research: A valuable resource for researchers working on language models and natural language processing specific to the Italian language.
- Fine-tuning for Italian Tasks: Provides a strong base for further fine-tuning on specialized Italian datasets; a hedged QLoRA sketch follows below.
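Since the model card names QLoRA as the tuning method, a further fine-tuning run could follow the same recipe. The sketch below uses `peft` and `trl`; every hyperparameter, the dataset file `my_italian_dataset.jsonl`, and the output directory are illustrative assumptions, not the authors' actual training configuration.

```python
# Hedged QLoRA fine-tuning sketch mirroring the approach named in the model
# card (QLoRA). All hyperparameters and the dataset are illustrative
# assumptions; assumes a recent trl version with SFTConfig.
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from trl import SFTConfig, SFTTrainer

model_id = "swap-uniba/LLaMAntino-2-70b-hf-UltraChat-ITA"

# Load the base model in 4-bit, as QLoRA requires a quantized backbone.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    ),
    device_map="auto",
)

peft_config = LoraConfig(  # illustrative LoRA hyperparameters
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

# "my_italian_dataset.jsonl" is a hypothetical placeholder for a specialized
# Italian instruction dataset with a "text" column.
dataset = load_dataset("json", data_files="my_italian_dataset.jsonl")["train"]

trainer = SFTTrainer(  # the tokenizer is loaded from the model id by trl
    model=model,
    train_dataset=dataset,
    peft_config=peft_config,
    args=SFTConfig(output_dir="llamantino-ft", max_seq_length=2048),
)
trainer.train()
```

Only the LoRA adapter weights are trained here, which is what makes tuning a 70B model feasible on modest hardware.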