mlabonne/ChimeraLlama-3-8B-v3
mlabonne/ChimeraLlama-3-8B-v3 is an 8 billion parameter language model based on the Llama 3 architecture, created by mlabonne through a merge of several Llama 3-based models using LazyMergekit. This model integrates various instruction-tuned and DPO-optimized Llama 3 variants to enhance general performance. It is designed for broad applicability in conversational AI and instruction-following tasks, leveraging the strengths of its constituent models.
Model Overview
mlabonne/ChimeraLlama-3-8B-v3 is an 8 billion parameter language model developed by mlabonne. It was produced by merging several Llama 3-based models, including instruction-tuned and DPO-optimized variants, with the LazyMergekit tool. The merge is intended to combine the strengths of its constituent models into a single checkpoint with improved overall performance across natural language processing tasks.
Key Characteristics
- Architecture: Based on the Llama 3 family, leveraging its foundational capabilities.
- Merge Method: Uses the `dare_ties` merge method, integrating models such as NousResearch/Meta-Llama-3-8B-Instruct, mlabonne/OrpoLlama-3-8B, and cognitivecomputations/dolphin-2.9-llama3-8b, among others.
- Context Length: Supports an 8,192-token context window.
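For context, a LazyMergekit `dare_ties` merge is driven by a YAML configuration listing the source models and their mixing parameters. The sketch below is illustrative only: the `density` and `weight` values, and the choice of NousResearch/Meta-Llama-3-8B as the base model, are assumptions, not the actual configuration used for this merge.

```yaml
# Illustrative mergekit config for a dare_ties merge of Llama 3 variants.
# Densities and weights shown here are placeholders, not the real values.
models:
  - model: NousResearch/Meta-Llama-3-8B
    # Base model; contributes the reference weights, no merge parameters.
  - model: NousResearch/Meta-Llama-3-8B-Instruct
    parameters:
      density: 0.6   # fraction of delta weights kept (illustrative)
      weight: 0.5    # mixing weight (illustrative)
  - model: mlabonne/OrpoLlama-3-8B
    parameters:
      density: 0.55
      weight: 0.3
  - model: cognitivecomputations/dolphin-2.9-llama3-8b
    parameters:
      density: 0.55
      weight: 0.2
merge_method: dare_ties
base_model: NousResearch/Meta-Llama-3-8B
dtype: bfloat16
```

With `dare_ties`, each source model's delta from the base is randomly sparsified according to `density`, sign conflicts are resolved, and the surviving deltas are combined using the per-model `weight` values.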
Performance Insights
Evaluations on the Open LLM Leaderboard indicate a balanced performance across several benchmarks:
- Average Score: 20.53
- IFEval (0-shot): 44.08
- BBH (3-shot): 27.65
- MMLU-PRO (5-shot): 29.65
These scores suggest a model capable of handling instruction-following, common-sense reasoning, and general knowledge tasks, making it suitable for a range of applications requiring robust language understanding and generation.
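As with most merged Llama 3 checkpoints, the model can be used through the standard Hugging Face `transformers` chat-template API. The sketch below is a minimal example, assuming the model is published at `mlabonne/ChimeraLlama-3-8B-v3` and that a GPU with roughly 16 GB of VRAM is available for fp16 inference; `build_messages` is a hypothetical helper added here for clarity.

```python
def build_messages(prompt: str) -> list[dict]:
    """Wrap a user prompt in the chat-message format used by Llama 3 chat templates."""
    return [{"role": "user", "content": prompt}]


def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Load the model lazily and generate a reply.

    Note: the first call downloads the full ~16 GB checkpoint from the Hub.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "mlabonne/ChimeraLlama-3-8B-v3"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.float16, device_map="auto"
    )
    # Apply the Llama 3 chat template and generate a continuation.
    inputs = tokenizer.apply_chat_template(
        build_messages(prompt), add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
```

Since the context window is 8,192 tokens, prompts longer than that must be truncated or chunked before calling `generate`.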