anakin87/Llama-3-8b-ita-ties
anakin87/Llama-3-8b-ita-ties is an 8 billion parameter language model based on the Llama-3 architecture, created by anakin87 through a TIES merge of two prominent Italian LLMs. This model is specifically designed and optimized for Italian language tasks, aiming to enhance performance in understanding and generating Italian text. It leverages the strengths of its merged components to provide a capable solution for applications requiring strong Italian language proficiency.
Loading preview...
Overview
anakin87/Llama-3-8b-ita-ties is an 8 billion parameter language model built upon the Meta-Llama-3-8B base. It was developed by anakin87 using the TIES (Trimmed, Iterative, and Selective) merging method to combine two leading Italian language models: swap-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA and DeepMount00/Llama-3-8b-Ita. The primary goal of this merge was to create a model with enhanced capabilities for the Italian language.
Key Characteristics
- Italian Language Focus: Specifically optimized for processing and generating Italian text.
- Merge Method: Utilizes the TIES merging technique, which selectively combines parameters from multiple models.
- Base Model: Built on the robust Meta-Llama-3-8B architecture.
- Parameter Count: Features 8 billion parameters, offering a balance between performance and computational efficiency.
Performance
While the creator notes that the merge aimed to improve upon existing models, the results are described as "acceptable" without surpassing the best existing models. Performance metrics are available on the Leaderboard for Italian Language Models. The model achieved the following normalized accuracy scores:
- hellaswag_it: 0.6621
- arc_it: 0.5535
- m_mmlu_it 5-shot: 0.5749
- Average: 0.5968
Use Cases
This model is suitable for applications requiring a strong understanding and generation of Italian language, such as:
- Italian text generation and summarization.
- Chatbots and conversational AI in Italian.
- Language understanding tasks specific to Italian.
- Research and development in Italian NLP.
Top 3 parameter combinations used by Featherless users for this model. Click a tab to see each config.