swap-uniba/LLaMAntino-2-chat-13b-hf-UltraChat-ITA
LLaMAntino-2-chat-13b-hf-UltraChat-ITA is a 13 billion parameter instruction-tuned Large Language Model developed by swap-uniba and adapted from LLaMA 2 chat. The model is fine-tuned specifically for Italian dialogue use cases, leveraging the UltraChat dataset translated into Italian. It aims to give NLP researchers working on Italian language applications improved performance, and supports a 4096-token context length.
LLaMAntino-2-chat-13b-hf-UltraChat-ITA Overview
LLaMAntino-2-chat-13b-hf-UltraChat-ITA is a 13 billion parameter instruction-tuned Large Language Model (LLM) developed by Pierpaolo Basile, Elio Musacchio, Marco Polignano, Lucia Siciliani, Giuseppe Fiameni, and Giovanni Semeraro at swap-uniba. It is an Italian-adapted version of LLaMA 2 chat, specifically fine-tuned to enhance performance in Italian dialogue scenarios.
Key Capabilities & Training
- Italian Dialogue Optimization: The model is designed to support Italian NLP research and Italian-language dialogue applications.
- Instruction-Tuned: It has been instruction-tuned using QLoRA on the UltraChat dataset, which was translated into Italian using Argos Translate.
- LLaMA 2 Base: Built upon the LLaMA 2 chat architecture, ensuring a robust foundation.
- Prompt Format: Utilizes a LLaMA 2-adapted Italian prompt template for optimal inference results.
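The LLaMA 2 chat template wraps a system prompt in `<<SYS>>` tags inside an `[INST]` instruction block. A minimal sketch of how such a prompt might be assembled for this model (the Italian system message below is an illustrative placeholder, not the exact template published with the model):

```python
def build_prompt(user_message: str, system_prompt: str) -> str:
    """Assemble a LLaMA 2 chat-style prompt.

    The layout follows the standard LLaMA 2 chat format; check the model
    card for the exact Italian template the authors recommend.
    """
    return (
        "<s>[INST] <<SYS>>\n"
        f"{system_prompt}\n"
        "<</SYS>>\n\n"
        f"{user_message} [/INST]"
    )

# Hypothetical Italian system prompt, used here only for illustration.
prompt = build_prompt(
    "Qual è la capitale dell'Italia?",
    "Sei un assistente disponibile, rispettoso e onesto. Rispondi sempre in italiano.",
)
print(prompt)
```

The resulting string can then be passed to the tokenizer and model for generation; newer versions of `transformers` can also apply a chat template stored with the tokenizer via `tokenizer.apply_chat_template`, if the model repository provides one.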
Use Cases
- Italian NLP Research: Ideal for researchers focusing on Italian language processing.
- Dialogue Systems: Suitable for developing chatbots and conversational AI agents in Italian.
- Text Generation: Can be used for generating coherent and contextually relevant Italian text in a conversational style.
This model was developed with funding from the PNRR project FAIR - Future AI Research, and training was carried out on the Leonardo supercomputer. The associated training code is available in the swapUniba/LLaMAntino repository.