galatolo/cerbero-7b

Hugging Face
Text generation · Concurrency cost: 1 · Model size: 7B · Quant: FP8 · Context length: 4k · Published: Oct 26, 2023 · License: apache-2.0 · Architecture: Transformer

galatolo/cerbero-7b is a 7 billion parameter, fully fine-tuned Italian Large Language Model built upon the Mistral-7B architecture. It is specifically designed to excel in understanding and generating Italian text, outperforming other Italian LLMs like Fauno and Camoscio on benchmarks such as SQuAD-it and EVALITA tasks. This model is optimized for a wide range of Italian language AI applications, including question answering, toxicity detection, irony detection, and sentiment analysis.


cerbero-7b: A Dedicated Italian LLM

cerbero-7b is a 7 billion parameter, fully fine-tuned Italian Large Language Model developed by galatolo. It is built on the robust Mistral-7B base model and trained on the specialized Cerbero Dataset, which was generated using an innovative dynamic self-chat method. A notable variant, cerbero-7b-openchat, is also available; it is based on openchat3.5, and its authors report performance on par with or better than ChatGPT 3.5.

Key Capabilities & Performance

  • Superior Italian Language Proficiency: Significantly outperforms other Italian LLMs like Fauno and Camoscio on Italian-specific benchmarks.
  • Question Answering: Achieves 72.55% F1 score and 55.6% Exact Match on SQuAD-it.
  • EVALITA Benchmarks: Demonstrates strong performance in Toxicity Detection (63.04% F1), Irony Detection (48.51% F1), and Sentiment Analysis (61.80% F1).
  • Full Fine-tuning: Unlike LoRA- or QLoRA-based adapters, cerbero-7b is fully fine-tuned, meaning all model weights are updated during training on an expansive synthetic Italian dataset, with an 8192-token context window.
  • Permissive Licensing: Released under the Apache 2.0 license, permitting both commercial and research use.

Use Cases & Integration

  • Italian AI Applications: Ideal for developing advanced AI solutions that require deep understanding and generation of Italian text.
  • Research and Commercial Projects: Suitable for both academic research and deployment in commercial products due to its open license.
  • Easy Integration: Compatible with Hugging Face Transformers and llama.cpp for flexible deployment, including quantized GGUF versions for resource-constrained environments.
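As a minimal sketch of the Transformers integration mentioned above: the snippet below wraps a user message in a conversational prompt and generates a reply. The `[|Umano|]`/`[|Assistente|]` prompt tags and the Italian system preamble are assumptions based on cerbero's conversational format; verify them against the model card before relying on them.

```python
def build_prompt(user_message: str) -> str:
    """Wrap a user message in an assumed cerbero-7b conversational format."""
    system = "Questa è una conversazione tra un umano ed un assistente AI."
    return f"{system}\n[|Umano|] {user_message}\n[|Assistente|] "


def generate(user_message: str, max_new_tokens: int = 128) -> str:
    """Generate a reply with galatolo/cerbero-7b via Hugging Face Transformers."""
    # Imports and model loading kept inside the function so the sketch can be
    # inspected without downloading ~14 GB of weights.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("galatolo/cerbero-7b")
    model = AutoModelForCausalLM.from_pretrained(
        "galatolo/cerbero-7b", device_map="auto"
    )
    inputs = tokenizer(build_prompt(user_message), return_tensors="pt").to(
        model.device
    )
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens and keep only the assistant's reply.
    reply_ids = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(reply_ids, skip_special_tokens=True)


if __name__ == "__main__":
    print(generate("Qual è la capitale d'Italia?"))
```

For resource-constrained deployments, the quantized GGUF versions noted above can be run with llama.cpp instead, using the same prompt format.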

For more technical details, refer to the research paper on arXiv.