swap-uniba/LLaMAntino-2-chat-13b-hf-UltraChat-ITA

Text Generation · Model Size: 13B · Quantization: FP8 · Context Length: 4k · Published: Dec 16, 2023 · License: llama2 · Architecture: Transformer · Open Weights

LLaMAntino-2-chat-13b-hf-UltraChat-ITA is a 13 billion parameter instruction-tuned Large Language Model developed by swap-uniba, adapted from LLaMA 2 chat. The model is fine-tuned specifically for Italian dialogue use cases on the UltraChat dataset translated into Italian, aims to give NLP researchers improved performance on Italian language applications, and supports a 4096-token context length.


LLaMAntino-2-chat-13b-hf-UltraChat-ITA Overview

LLaMAntino-2-chat-13b-hf-UltraChat-ITA is a 13 billion parameter instruction-tuned Large Language Model (LLM) developed by Pierpaolo Basile, Elio Musacchio, Marco Polignano, Lucia Siciliani, Giuseppe Fiameni, and Giovanni Semeraro at swap-uniba. It is an Italian-adapted version of LLaMA 2 chat, specifically fine-tuned to enhance performance in Italian dialogue scenarios.

Key Capabilities & Training

  • Italian Dialogue Optimization: The model is specifically designed to improve Italian NLP research and dialogue applications.
  • Instruction-Tuned: It has been instruction-tuned using QLoRA on the UltraChat dataset, which was translated into Italian using Argos Translate.
  • LLaMA 2 Base: Built upon the LLaMA 2 chat architecture, ensuring a robust foundation.
  • Prompt Format: Utilizes a LLaMA 2-adapted Italian prompt template for optimal inference results.
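Since the model follows a LLaMA 2-adapted Italian prompt template, a prompt for single-turn inference can be assembled as in the sketch below. This assumes the standard LLaMA 2 chat layout (`[INST]` / `<<SYS>>` markers); the Italian system prompt shown is a hypothetical placeholder, so consult the model card for the exact template the authors used.

```python
def build_llama2_prompt(user_message: str, system_prompt: str) -> str:
    """Assemble a single-turn prompt in the standard LLaMA 2 chat format.

    This mirrors the generic LLaMA 2 template; the exact Italian template
    used by LLaMAntino may differ slightly (see the model card).
    """
    return (
        "<s>[INST] <<SYS>>\n"
        f"{system_prompt}\n"
        "<</SYS>>\n\n"
        f"{user_message} [/INST]"
    )

# Hypothetical Italian system prompt (illustrative, not from the model card):
# "You are a helpful assistant who always answers in Italian."
SYSTEM = "Sei un assistente disponibile che risponde sempre in italiano."

prompt = build_llama2_prompt("Qual e' la capitale d'Italia?", SYSTEM)
print(prompt)
```

The resulting string can be passed directly to a text-generation pipeline or tokenizer; keeping the system prompt in Italian matches the model's Italian-dialogue fine-tuning.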

Use Cases

  • Italian NLP Research: Ideal for researchers focusing on Italian language processing.
  • Dialogue Systems: Suitable for developing chatbots and conversational AI agents in Italian.
  • Text Generation: Can be used for generating coherent and contextually relevant Italian text in a conversational style.

This model was developed with funding from the PNRR project FAIR - Future AI Research and utilized the Leonardo supercomputer for its compute infrastructure. The associated training code is available in the swapUniba/LLaMAntino repository.