mlabonne/ChimeraLlama-3-8B-v3

Text generation · Concurrency cost: 1 · Model size: 8B · Quant: FP8 · Context length: 8K · Published: May 1, 2024 · License: other · Architecture: Transformer

mlabonne/ChimeraLlama-3-8B-v3 is an 8 billion parameter language model based on the Llama 3 architecture, created by mlabonne through a merge of several Llama 3-based models using LazyMergekit. This model integrates various instruction-tuned and DPO-optimized Llama 3 variants to enhance general performance. It is designed for broad applicability in conversational AI and instruction-following tasks, leveraging the strengths of its constituent models.


Model Overview

mlabonne/ChimeraLlama-3-8B-v3 is an 8 billion parameter language model developed by mlabonne. It is a product of merging multiple Llama 3-based models, including instruction-tuned and DPO-optimized variants, using the LazyMergekit tool. This approach aims to combine the strengths of its constituent models to achieve improved overall performance in various natural language processing tasks.

Key Characteristics

  • Architecture: Based on the Llama 3 family, leveraging its foundational capabilities.
  • Merge Method: Utilizes the dare_ties merge method, integrating models such as NousResearch/Meta-Llama-3-8B-Instruct, mlabonne/OrpoLlama-3-8B, and cognitivecomputations/dolphin-2.9-llama3-8b, among others.
  • Context Length: Supports an 8192-token context window.
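
The exact merge configuration is not reproduced here, but a dare_ties merge in mergekit (the tool LazyMergekit wraps) is expressed as a YAML config. The sketch below is a hedged illustration for the models named above; the `density` and `weight` values are hypothetical placeholders, not the values mlabonne actually used:

```yaml
# Illustrative dare_ties config for mergekit; density/weight values are hypothetical.
models:
  - model: NousResearch/Meta-Llama-3-8B
    # Base model: contributes no task vector of its own.
  - model: NousResearch/Meta-Llama-3-8B-Instruct
    parameters:
      density: 0.6
      weight: 0.5
  - model: mlabonne/OrpoLlama-3-8B
    parameters:
      density: 0.55
      weight: 0.2
  - model: cognitivecomputations/dolphin-2.9-llama3-8b
    parameters:
      density: 0.55
      weight: 0.2
merge_method: dare_ties
base_model: NousResearch/Meta-Llama-3-8B
parameters:
  int8_mask: true
dtype: float16
```

In a dare_ties merge, each fine-tuned model's delta from the base is randomly sparsified (controlled by `density`), sign-consensus-filtered, and then combined with the given `weight`, which is why the merged model can inherit behavior from several instruction-tuned parents at once.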

Performance Insights

Evaluations on the Open LLM Leaderboard indicate balanced performance across several benchmarks:

  • Average Score: 20.53
  • IFEval (0-Shot): 44.08
  • BBH (3-Shot): 27.65
  • MMLU-PRO (5-Shot): 29.65

These scores suggest a model capable of handling instruction-following, common-sense reasoning, and general knowledge tasks, making it suitable for a range of applications requiring robust language understanding and generation.

Popular Sampler Settings

The sampler parameters most commonly adjusted by Featherless users for this model:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
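
These parameters shape how the next token is drawn from the model's output distribution at decode time. As a minimal, stdlib-only sketch of how temperature, top-k, and top-p interact (not Featherless's actual backend, and the filter order shown, top-k before top-p, is an assumption matching common inference pipelines):

```python
import math
import random

def sample_next_token(logits, temperature=0.7, top_k=40, top_p=0.9, rng=None):
    """Pick a token index from raw logits using common sampler settings."""
    rng = rng or random.Random()
    # Temperature: lower values sharpen the distribution, higher values flatten it.
    scaled = [l / max(temperature, 1e-8) for l in logits]
    # Top-k: keep only the k highest-scoring candidates.
    ranked = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)
    kept = ranked[:top_k] if top_k > 0 else ranked
    # Softmax over the surviving candidates (shifted by the max for stability).
    peak = max(scaled[i] for i in kept)
    exps = [(i, math.exp(scaled[i] - peak)) for i in kept]
    total = sum(e for _, e in exps)
    probs = [(i, e / total) for i, e in exps]  # sorted high-to-low
    # Top-p (nucleus): keep the smallest prefix whose cumulative mass reaches top_p.
    nucleus, cum = [], 0.0
    for i, p in probs:
        nucleus.append((i, p))
        cum += p
        if cum >= top_p:
            break
    # Renormalize the nucleus and draw one token.
    mass = sum(p for _, p in nucleus)
    draw, acc = rng.random() * mass, 0.0
    for i, p in nucleus:
        acc += p
        if acc >= draw:
            return i
    return nucleus[-1][0]
```

Frequency, presence, and repetition penalties act one step earlier, adjusting the logits of already-seen tokens before any of the filtering above is applied.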