swap-uniba/LLaMAntino-2-7b-hf-ITA

Hugging Face · Text Generation
Model Size: 7B · Quantization: FP8 · Context Length: 4k · Published: Dec 14, 2023 · License: llama2 · Architecture: Transformer · Open Weights

LLaMAntino-2-7b-hf-ITA is a 7 billion parameter LLaMA 2-based large language model developed by swap-uniba and adapted specifically for the Italian language. Trained with QLoRA on the clean_mc4_it medium dataset, it serves as a base model for Italian natural language generation, aimed primarily at Italian NLP researchers. The model supports a context length of 4096 tokens and is released under the Llama 2 Community License.


LLaMAntino-2-7b-hf-ITA: Italian-Adapted LLaMA 2 Model

LLaMAntino-2-7b-hf-ITA is a 7 billion parameter Large Language Model (LLM) developed by Pierpaolo Basile, Elio Musacchio, Marco Polignano, Lucia Siciliani, Giuseppe Fiameni, and Giovanni Semeraro, funded by the PNRR project FAIR - Future AI Research. It is an Italian-adapted version of the LLaMA 2 architecture, fine-tuned from meta-llama/Llama-2-7b-hf.

Key Capabilities & Training

  • Italian Language Focus: Specifically adapted and trained for the Italian language, making it a valuable resource for Italian NLP research.
  • Training Methodology: Uses QLoRA (4-bit quantized low-rank adaptation) for memory-efficient fine-tuning.
  • Training Data: Trained on the clean_mc4_it medium dataset, ensuring exposure to a broad range of Italian text.
  • Computational Resources: Training was conducted on the Leonardo supercomputer.
  • Model Type: Inherits the LLaMA 2 decoder-only Transformer architecture with a 4096-token context window, providing a robust foundation for downstream tasks.
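The QLoRA setup described above can be sketched with the Hugging Face transformers, peft, and bitsandbytes libraries. The hyperparameters below (LoRA rank, alpha, dropout, target modules) are illustrative assumptions for a typical LLaMA 2 QLoRA run, not the authors' published settings:

```python
# Illustrative QLoRA hyperparameters -- assumptions, not the values
# actually used to train LLaMAntino-2-7b-hf-ITA.
QLORA_CONFIG = {
    "load_in_4bit": True,          # quantize the frozen base weights to 4-bit
    "bnb_4bit_quant_type": "nf4",  # NormalFloat4, the quantization QLoRA introduced
    "lora_r": 16,                  # LoRA rank
    "lora_alpha": 32,              # LoRA scaling factor
    "lora_dropout": 0.05,
    "target_modules": ["q_proj", "v_proj"],  # attach adapters to attention projections
}


def build_qlora_model(model_id="meta-llama/Llama-2-7b-hf"):
    """Load the base model in 4-bit and attach LoRA adapters.

    Calling this requires transformers, peft, bitsandbytes, and a GPU with
    enough memory for the quantized 7B weights, so the heavy imports are
    kept inside the function.
    """
    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig
    from peft import LoraConfig, get_peft_model

    bnb_config = BitsAndBytesConfig(
        load_in_4bit=QLORA_CONFIG["load_in_4bit"],
        bnb_4bit_quant_type=QLORA_CONFIG["bnb_4bit_quant_type"],
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
    model = AutoModelForCausalLM.from_pretrained(
        model_id, quantization_config=bnb_config, device_map="auto"
    )
    lora_config = LoraConfig(
        r=QLORA_CONFIG["lora_r"],
        lora_alpha=QLORA_CONFIG["lora_alpha"],
        lora_dropout=QLORA_CONFIG["lora_dropout"],
        target_modules=QLORA_CONFIG["target_modules"],
        task_type="CAUSAL_LM",
    )
    # Only the small LoRA adapter matrices are trained; the 4-bit base stays frozen.
    return get_peft_model(model, lora_config)
```

With this pattern, only the adapter weights are updated during training, which is what makes fine-tuning a 7B model feasible on a single GPU or a modest slice of a cluster like Leonardo.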

Use Cases

  • Natural Language Generation: Primarily aimed at providing Italian NLP researchers with a base model for natural language generation tasks.
  • Research & Development: Suitable for experiments and applications requiring strong Italian language understanding and generation capabilities.
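As a starting point for the generation use cases above, here is a minimal inference sketch using the Hugging Face transformers API. The prompt and generation parameters are illustrative; since this is a base (non-instruct) model, it is prompted as a plain text completion:

```python
MODEL_ID = "swap-uniba/LLaMAntino-2-7b-hf-ITA"
CONTEXT_LENGTH = 4096  # maximum context stated on the model card


def truncate_to_context(token_ids, max_len=CONTEXT_LENGTH):
    """Keep only the most recent tokens so the prompt fits the 4k context."""
    return token_ids[-max_len:]


def generate_completion(prompt, max_new_tokens=64):
    """Generate an Italian text completion.

    Calling this downloads the 7B checkpoint and needs substantial memory,
    so the heavy imports are kept inside the function.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

    # Base model: no chat template, just continue the given text.
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
    output = model.generate(input_ids, max_new_tokens=max_new_tokens,
                            do_sample=True, top_p=0.9)
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

For example, `generate_completion("La cucina italiana è famosa per")` would continue the sentence in Italian; for long inputs, truncating to the 4096-token window avoids exceeding the model's context.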

Licensing & Citation

The model is released under the Llama 2 Community License. Users who build on it in research are asked to cite the associated arXiv paper: Basile et al., 2023, "LLaMAntino: LLaMA 2 Models for Effective Text Generation in Italian Language".