mlabonne/ChimeraLlama-3-8B-v2
ChimeraLlama-3-8B-v2: A Merged Llama 3 Model
ChimeraLlama-3-8B-v2 is an 8 billion parameter language model developed by mlabonne. It is a product of merging six different Llama 3-based models, including instruction-tuned and DPO-optimized variants, using the LazyMergekit tool. This merging strategy aims to combine the strengths of its constituent models, such as NousResearch/Meta-Llama-3-8B-Instruct, mlabonne/OrpoLlama-3-8B, and cognitivecomputations/dolphin-2.9-llama3-8b, among others.
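LazyMergekit is a notebook wrapper around mergekit, which builds a merge from a declarative YAML config listing the source models and a merge method. The sketch below illustrates the general shape of such a config; the merge method, density, and weight values are hypothetical placeholders, not the actual ChimeraLlama-3-8B-v2 recipe, and the model list is truncated.

```yaml
# Hypothetical mergekit config sketch — NOT the actual ChimeraLlama-3-8B-v2 recipe.
models:
  - model: NousResearch/Meta-Llama-3-8B
    # The base model contributes no task vector of its own.
  - model: NousResearch/Meta-Llama-3-8B-Instruct
    parameters:
      density: 0.6   # fraction of delta weights retained (placeholder value)
      weight: 0.5    # relative contribution to the merge (placeholder value)
  - model: mlabonne/OrpoLlama-3-8B
    parameters:
      density: 0.6
      weight: 0.3
  # ...remaining constituent models omitted for brevity
merge_method: dare_ties   # assumed method for illustration; merges sparse task vectors
base_model: NousResearch/Meta-Llama-3-8B
dtype: bfloat16
```

Each non-base entry contributes a "task vector" (its delta from the base model); density controls how aggressively that delta is sparsified before the weighted combination.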
Key Capabilities & Performance
On the Open LLM Leaderboard, the model achieves an average score of 19.99. Selected benchmark results:
- IFEval (0-Shot): 44.69
- BBH (3-Shot): 28.48
- MMLU-PRO (5-shot): 28.54
These scores cover instruction following (IFEval), reasoning (BBH), and broad knowledge (MMLU-PRO). The model supports a context length of 8,192 tokens, making it suitable for moderately long inputs.
Good For
- General-purpose instruction following tasks.
- Applications benefiting from a blend of different Llama 3 fine-tunes.
- Scenarios requiring a capable 8B parameter model with a balanced performance profile.
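For instruction-following use, prompts would normally be formatted with `tokenizer.apply_chat_template()` from transformers. The standalone sketch below spells out the Llama 3 instruct prompt format explicitly, on the assumption that the merged model inherits this template from its instruct-tuned constituents.

```python
# Sketch of building a Llama 3 instruct-style prompt, assuming
# ChimeraLlama-3-8B-v2 follows the standard Llama 3 chat format.
# In practice, prefer tokenizer.apply_chat_template() from transformers.

def build_llama3_prompt(messages):
    """Format a list of {'role', 'content'} dicts into a Llama 3 prompt string."""
    prompt = "<|begin_of_text|>"
    for msg in messages:
        prompt += (
            f"<|start_header_id|>{msg['role']}<|end_header_id|>\n\n"
            f"{msg['content']}<|eot_id|>"
        )
    # Leave the assistant header open so the model generates the reply.
    prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize model merging in one sentence."},
]
print(build_llama3_prompt(messages))
```

The trailing open assistant header is what cues the model to produce its reply; generation is typically stopped at the `<|eot_id|>` token.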