anakin87/Llama-3-8b-ita-ties-pro

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:8kPublished:May 18, 2024License:llama3Architecture:Transformer0.0K Warm

anakin87/Llama-3-8b-ita-ties-pro is an 8 billion parameter language model based on the Llama 3 architecture, created by anakin87 using the TIES merge method. It combines two Italian LLMs, DeepMount00/Llama-3-8b-Ita and swap-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA, with Meta-Llama-3-8B-Instruct as its base. This model is specifically designed and optimized for Italian language tasks, offering a context length of 8192 tokens.

Loading preview...

Model Overview

anakin87/Llama-3-8b-ita-ties-pro is an 8 billion parameter language model developed by anakin87. It was created using the TIES merge method, combining two specialized Italian language models: DeepMount00/Llama-3-8b-Ita and swap-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA, with Meta-Llama-3-8B-Instruct serving as the foundational base model.

Key Characteristics

  • Architecture: Llama 3 family, 8 billion parameters.
  • Merge Method: Utilizes the TIES (Trimmed, Iterative, and Self-consistent) merging technique, which is designed to combine the strengths of multiple pre-trained models.
  • Italian Language Focus: Specifically engineered by merging models known for their performance in Italian, aiming to enhance capabilities for Italian-centric applications.
  • Context Length: Supports an 8192-token context window.

Performance Metrics

Evaluations indicate competitive performance for Italian language tasks, with an average accuracy of 0.6110 across various benchmarks. Specific scores include:

  • hellaswag_it acc_norm: 0.6967
  • arc_it acc_norm: 0.5646
  • m_mmlu_it 5-shot acc: 0.5717

For a comprehensive comparison, users can refer to the Leaderboard for Italian Language Models.

Use Cases

This model is particularly suitable for applications requiring strong performance in the Italian language, such as:

  • Content generation in Italian.
  • Italian text summarization and analysis.
  • Chatbots or conversational AI systems interacting in Italian.

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model. Click a tab to see each config.

temperature
top_p
top_k
frequency_penalty
presence_penalty
repetition_penalty
min_p