swap-uniba/LLaMAntino-2-7b-hf-ITA

Hugging Face · Text Generation
Model Size: 7B · Quantization: FP8 · Context Length: 4k · Published: Dec 14, 2023 · License: llama2 · Architecture: Transformer · Open Weights

LLaMAntino-2-7b-hf-ITA is a 7 billion parameter LLaMA 2-based large language model developed by swap-uniba and adapted specifically for the Italian language. Trained with QLoRA on the clean_mc4_it medium dataset, it serves as a base model for Italian natural language generation, aimed primarily at Italian NLP researchers. The model supports a context length of 4096 tokens and is released under the Llama 2 Community License.


LLaMAntino-2-7b-hf-ITA: Italian-Adapted LLaMA 2 Model

LLaMAntino-2-7b-hf-ITA is a 7 billion parameter Large Language Model (LLM) developed by Pierpaolo Basile, Elio Musacchio, Marco Polignano, Lucia Siciliani, Giuseppe Fiameni, and Giovanni Semeraro, funded by the PNRR project FAIR - Future AI Research. It is an Italian-adapted version of the LLaMA 2 architecture, fine-tuned from meta-llama/Llama-2-7b-hf.

Key Capabilities & Training

  • Italian Language Focus: Specifically adapted and trained for the Italian language, making it a valuable resource for Italian NLP research.
  • Training Methodology: Uses QLoRA (4-bit quantized low-rank adaptation) for memory-efficient fine-tuning.
  • Training Data: Trained on the clean_mc4_it medium dataset, ensuring exposure to a broad range of Italian text.
  • Computational Resources: Training was conducted on the Leonardo supercomputer.
  • Model Type: Inherits the LLaMA 2 decoder-only Transformer architecture with a 4096-token context window, providing a robust foundation for downstream tasks.
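The QLoRA setup described above can be sketched with the Hugging Face transformers, peft, and bitsandbytes libraries. The hyperparameters below (LoRA rank, alpha, dropout, target modules) are illustrative assumptions for a typical LLaMA 2 QLoRA run, not the authors' published settings:

```python
# Illustrative QLoRA hyperparameters -- assumptions, not the values
# actually used to train LLaMAntino-2-7b-hf-ITA.
QLORA_CONFIG = {
    "load_in_4bit": True,          # quantize the frozen base weights to 4-bit
    "bnb_4bit_quant_type": "nf4",  # NormalFloat4, the quantization QLoRA introduced
    "lora_r": 16,                  # LoRA rank
    "lora_alpha": 32,              # LoRA scaling factor
    "lora_dropout": 0.05,
    "target_modules": ["q_proj", "v_proj"],  # attach adapters to attention projections
}


def build_qlora_model(model_id="meta-llama/Llama-2-7b-hf"):
    """Load the base model in 4-bit and attach LoRA adapters.

    Calling this requires transformers, peft, bitsandbytes, and a GPU with
    enough memory for the quantized 7B weights, so the heavy imports are
    kept inside the function.
    """
    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig
    from peft import LoraConfig, get_peft_model

    bnb_config = BitsAndBytesConfig(
        load_in_4bit=QLORA_CONFIG["load_in_4bit"],
        bnb_4bit_quant_type=QLORA_CONFIG["bnb_4bit_quant_type"],
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
    model = AutoModelForCausalLM.from_pretrained(
        model_id, quantization_config=bnb_config, device_map="auto"
    )
    lora_config = LoraConfig(
        r=QLORA_CONFIG["lora_r"],
        lora_alpha=QLORA_CONFIG["lora_alpha"],
        lora_dropout=QLORA_CONFIG["lora_dropout"],
        target_modules=QLORA_CONFIG["target_modules"],
        task_type="CAUSAL_LM",
    )
    # Only the small LoRA adapter matrices are trained; the 4-bit base stays frozen.
    return get_peft_model(model, lora_config)
```

With this pattern, only the adapter weights are updated during training, which is what makes fine-tuning a 7B model feasible on a single GPU or a modest slice of a cluster like Leonardo.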

Use Cases

  • Natural Language Generation: Primarily aimed at providing Italian NLP researchers with a base model for natural language generation tasks.
  • Research & Development: Suitable for experiments and applications requiring strong Italian language understanding and generation capabilities.
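As a starting point for the generation use cases above, here is a minimal inference sketch using the Hugging Face transformers API. The prompt and generation parameters are illustrative; since this is a base (non-instruct) model, it is prompted as a plain text completion:

```python
MODEL_ID = "swap-uniba/LLaMAntino-2-7b-hf-ITA"
CONTEXT_LENGTH = 4096  # maximum context stated on the model card


def truncate_to_context(token_ids, max_len=CONTEXT_LENGTH):
    """Keep only the most recent tokens so the prompt fits the 4k context."""
    return token_ids[-max_len:]


def generate_completion(prompt, max_new_tokens=64):
    """Generate an Italian text completion.

    Calling this downloads the 7B checkpoint and needs substantial memory,
    so the heavy imports are kept inside the function.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

    # Base model: no chat template, just continue the given text.
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
    output = model.generate(input_ids, max_new_tokens=max_new_tokens,
                            do_sample=True, top_p=0.9)
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

For example, `generate_completion("La cucina italiana è famosa per")` would continue the sentence in Italian; for long inputs, truncating to the 4096-token window avoids exceeding the model's context.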

Licensing & Citation

The model is released under the Llama 2 Community License. Users who build on it in research are asked to cite the associated arXiv paper: Basile et al., 2023, "LLaMAntino: LLaMA 2 Models for Effective Text Generation in Italian Language".