mlabonne/Daredevil-8B

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:8kPublished:May 25, 2024License:otherArchitecture:Transformer0.0K Warm

mlabonne/Daredevil-8B is an 8 billion parameter Llama 3-based mega-merge model, specifically engineered to achieve the highest MMLU (Massive Multitask Language Understanding) score among Llama 3 8B models as of May 2024. Developed by mlabonne, this model leverages a merge of nine distinct Llama 3 8B variants using LazyMergekit, focusing on maximizing general knowledge and reasoning capabilities. It is designed as an improved, censored alternative to meta-llama/Meta-Llama-3-8B-Instruct, excelling in complex understanding and factual recall tasks.

Loading preview...

Overview

mlabonne/Daredevil-8B is an 8 billion parameter model built upon the Llama 3 architecture, created by mlabonne through a mega-merge process using LazyMergekit. Its primary design goal was to maximize performance on the MMLU benchmark, making it the top-performing Llama 3 8B model in this regard as of May 2024. This model integrates nine different Llama 3 8B variants, including contributions from nbeerbower, Hastagaras, and openchat, among others.

Key Capabilities

  • High MMLU Performance: Achieves the highest MMLU score among Llama 3 8B models, indicating strong general knowledge and reasoning abilities.
  • Benchmark Excellence: Demonstrates leading performance on Nous' benchmark suite, outperforming other Llama 3 8B models in categories like AGIEval, GPT4All, TruthfulQA, and Bigbench.
  • Enhanced Llama 3 Alternative: Functions as an improved version of meta-llama/Meta-Llama-3-8B-Instruct.
  • Censored Output: Provides a censored response behavior, with an uncensored variant (mlabonne/Daredevil-8B-abliterated) also available.

Good For

  • Applications requiring strong general knowledge and reasoning.
  • Use cases where high MMLU and benchmark scores are critical.
  • Developers seeking an enhanced, instruction-tuned Llama 3 8B model for various tasks.

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model. Click a tab to see each config.

temperature
top_p
top_k
frequency_penalty
presence_penalty
repetition_penalty
min_p