MoxoffSrL/Volare
Hosted on Hugging Face · Text generation · Open weights

Model size: 8.5B · Quantization: FP8 · Context length: 8k · Published: Apr 15, 2024 · License: MIT · Architecture: Transformer

MoxoffSrL/Volare is an 8.5 billion parameter language model fine-tuned from Google's Gemma-7B. It is designed specifically for Italian-language tasks and was trained on a mix of public datasets, including SQuAD-it, and proprietary in-house data. Volare excels at understanding and maintaining context, making it particularly well suited for Retrieval Augmented Generation (RAG) applications and other tasks that require strong contextual awareness.


Overview

Volare is an 8.5 billion parameter language model developed by MoxoffSrL, based on the Gemma-7B architecture. It was fine-tuned with supervised fine-tuning (SFT) and LoRA on a combination of publicly available datasets, such as SQuAD-it, and proprietary in-house datasets, and is optimized specifically for the Italian language.
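Since the model ships as a standard Hugging Face checkpoint, loading it with the `transformers` library should look roughly like the sketch below. This is an untested outline: the generation settings are illustrative, and the `load_volare` helper is our own wrapper, not part of the model's release.

```python
from typing import Callable


def load_volare(model_id: str = "MoxoffSrL/Volare") -> Callable[[str], str]:
    """Return a generate(prompt) function backed by the Volare checkpoint.

    transformers and torch are imported lazily so this module can be
    imported without them installed; actually calling this function
    requires network access and enough memory for an 8.5B-parameter model.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype="auto", device_map="auto"
    )

    def generate(prompt: str, max_new_tokens: int = 128) -> str:
        # Tokenize, generate, and decode a single completion.
        inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
        outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
        return tokenizer.decode(outputs[0], skip_special_tokens=True)

    return generate
```

With `device_map="auto"` and `torch_dtype="auto"`, transformers picks the device placement and precision from the checkpoint and available hardware, which is usually the right default for a model of this size.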

Key Capabilities

  • Contextual Understanding: Designed to effectively understand and maintain context within conversations and documents.
  • Retrieval Augmented Generation (RAG): Ideal for applications that require generating responses based on retrieved information, thanks to its strong contextual awareness.
  • Italian Language Proficiency: Fine-tuned with Italian datasets to enhance performance in Italian-specific tasks.
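Because Volare is positioned for RAG, a typical integration concatenates retrieved passages into the prompt ahead of the user's question. A minimal, framework-free sketch of that prompt assembly follows; the Italian instruction wording and the `build_rag_prompt` helper are illustrative, not an official prompt format for this model.

```python
def build_rag_prompt(question: str, passages: list[str]) -> str:
    """Assemble a grounded Italian prompt from retrieved passages.

    The template below is a hypothetical example; adapt the wording
    to whatever prompt format your deployment expects.
    """
    # Number each retrieved passage so the model can reference them.
    context = "\n\n".join(
        f"[Documento {i}]\n{p.strip()}" for i, p in enumerate(passages, start=1)
    )
    return (
        "Rispondi alla domanda usando solo il contesto fornito.\n\n"
        f"Contesto:\n{context}\n\n"
        f"Domanda: {question}\nRisposta:"
    )


prompt = build_rag_prompt(
    "Chi ha scritto la Divina Commedia?",
    ["La Divina Commedia è un poema di Dante Alighieri."],
)
```

The resulting string would then be passed to the model for generation, relying on the contextual grounding the capabilities above describe.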

Performance

Volare was evaluated on the same test sets used by the Open Ita LLM Leaderboard, achieving an average score of 0.555 and an F1 score of 69.82 across Italian benchmarks including hellaswag_it, arc_it, and m_mmlu_it.

Limitations

As Volare has not undergone RLHF for safety alignment, it may produce problematic outputs. The exact composition of the base model's training corpus is unknown, but it likely included a mix of web data, technical sources, and code.