MoxoffSrL/Volare

License: MIT
Overview

Volare is an 8.5-billion-parameter language model developed by MoxoffSrL, based on the Gemma-7B architecture. It was trained with supervised fine-tuning (SFT) and LoRA adapters on a combination of publicly available datasets, such as SQuAD-it, and proprietary in-house datasets, and is specifically optimized for the Italian language.
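The card does not include a usage snippet. Below is a minimal sketch using the standard `transformers` API; the Gemma-style chat-turn markers are an assumption based on the Gemma-7B base architecture, and `run_inference` is a hypothetical helper, not an official MoxoffSrL example.

```python
def build_gemma_prompt(user_message: str) -> str:
    """Wrap a user message in Gemma-style chat-turn markers (assumed format)."""
    return (
        "<start_of_turn>user\n"
        f"{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )


def run_inference(user_message: str) -> str:
    """Hypothetical helper: load the model and generate a reply.

    Requires `pip install transformers torch` and downloads the weights,
    so it is shown only as a sketch and not executed here.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "MoxoffSrL/Volare"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    inputs = tokenizer(build_gemma_prompt(user_message), return_tensors="pt")
    outputs = model.generate(**inputs.to(model.device), max_new_tokens=128)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)


# Example prompt the helper would send to the model:
print(build_gemma_prompt("Qual è la capitale d'Italia?"))
```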

Key Capabilities

  • Contextual Understanding: Effectively understands and maintains context across conversations and documents.
  • Retrieval-Augmented Generation (RAG): Well suited to applications that generate responses from retrieved information, thanks to its strong contextual awareness.
  • Italian Language Proficiency: Fine-tuned on Italian datasets to improve performance on Italian-specific tasks.
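For the RAG use case above, one straightforward approach is to concatenate retrieved passages with the question into a single Italian prompt. The template and labels below are illustrative assumptions, not an official Volare format.

```python
def build_rag_prompt(question: str, passages: list[str]) -> str:
    """Assemble retrieved passages and a question into one Italian prompt.

    Illustrative template: instructs the model to answer only from the
    provided context, which is where Volare's contextual awareness helps.
    """
    context = "\n\n".join(
        f"[Documento {i + 1}]\n{p}" for i, p in enumerate(passages)
    )
    return (
        "Rispondi alla domanda usando solo il contesto fornito.\n\n"
        f"Contesto:\n{context}\n\n"
        f"Domanda: {question}\n"
        "Risposta:"
    )


# Toy usage with two retrieved snippets:
passages = [
    "Volare è un modello linguistico da 8,5 miliardi di parametri.",
    "Il modello è basato sull'architettura Gemma-7B.",
]
print(build_rag_prompt("Su quale architettura è basato Volare?", passages))
```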

Performance

Volare was evaluated on the same test sets used by the Open Ita LLM Leaderboard. It achieved an average score of 0.555 across Italian benchmarks including hellaswag_it, arc_it, and m_mmlu_it, along with an F1 score of 69.82.

Limitations

Because Volare has not undergone RLHF-based safety alignment, it may produce problematic outputs. The exact composition of the base model's training corpus is unknown, but it likely included a mix of web data, technical sources, and code.