anakin87/Llama-3-8b-ita-ties

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:8kPublished:May 17, 2024License:llama3Architecture:Transformer0.0K Warm

anakin87/Llama-3-8b-ita-ties is an 8 billion parameter language model based on the Llama-3 architecture, created by anakin87 through a TIES merge of two prominent Italian LLMs. This model is specifically designed and optimized for Italian language tasks, aiming to enhance performance in understanding and generating Italian text. It leverages the strengths of its merged components to provide a capable solution for applications requiring strong Italian language proficiency.

Loading preview...

Overview

anakin87/Llama-3-8b-ita-ties is an 8 billion parameter language model built upon the Meta-Llama-3-8B base. It was developed by anakin87 using the TIES (Trimmed, Iterative, and Selective) merging method to combine two leading Italian language models: swap-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA and DeepMount00/Llama-3-8b-Ita. The primary goal of this merge was to create a model with enhanced capabilities for the Italian language.

Key Characteristics

  • Italian Language Focus: Specifically optimized for processing and generating Italian text.
  • Merge Method: Utilizes the TIES merging technique, which selectively combines parameters from multiple models.
  • Base Model: Built on the robust Meta-Llama-3-8B architecture.
  • Parameter Count: Features 8 billion parameters, offering a balance between performance and computational efficiency.

Performance

While the creator notes that the merge aimed to improve upon existing models, the results are described as "acceptable" without surpassing the best existing models. Performance metrics are available on the Leaderboard for Italian Language Models. The model achieved the following normalized accuracy scores:

  • hellaswag_it: 0.6621
  • arc_it: 0.5535
  • m_mmlu_it 5-shot: 0.5749
  • Average: 0.5968

Use Cases

This model is suitable for applications requiring a strong understanding and generation of Italian language, such as:

  • Italian text generation and summarization.
  • Chatbots and conversational AI in Italian.
  • Language understanding tasks specific to Italian.
  • Research and development in Italian NLP.

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model. Click a tab to see each config.

temperature
top_p
top_k
frequency_penalty
presence_penalty
repetition_penalty
min_p