mlabonne/Daredevil-7B

TEXT GENERATION

  • Model Size: 7B
  • Quantization: FP8
  • Context Length: 4k
  • Concurrency Cost: 1
  • Published: Jan 6, 2024
  • License: cc-by-nc-4.0
  • Architecture: Transformer

Daredevil-7B is a 7 billion parameter language model developed by mlabonne, created by merging three Mistral-7B-based models using LazyMergekit. This model demonstrates strong performance across various benchmarks, including AGIEval, GPT4All, and TruthfulQA, making it suitable for general-purpose reasoning and question-answering tasks. With a 4096-token context length, it offers a balanced solution for applications requiring robust language understanding and generation.


Overview

Daredevil-7B is a 7 billion parameter language model developed by mlabonne, constructed through a merge of three distinct Mistral-7B-based models: SamirGPT-v1, Slerp-CM-mist-dpo, and Mistral-7B-Merge-14-v0.2. This merge was performed using LazyMergekit with the dare_ties method, aiming to combine the strengths of its constituent models.
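For illustration, the sketch below shows how a dare_ties merge of this kind can be expressed for mergekit, the library underlying LazyMergekit. The density and weight values, the dtype, and the Hugging Face organization prefixes are assumptions made for the sake of a runnable example, not the exact settings used to produce Daredevil-7B.

```python
import subprocess
from pathlib import Path

# Illustrative dare_ties merge config in mergekit's YAML format.
# Densities, weights, dtype, and org prefixes are assumptions,
# not the exact recipe behind Daredevil-7B.
CONFIG = """\
models:
  - model: mistralai/Mistral-7B-v0.1
    # base model: no merge parameters needed
  - model: samir-fama/SamirGPT-v1                 # org prefix assumed
    parameters:
      density: 0.5
      weight: 0.4
  - model: EmbeddedLLM/Slerp-CM-mist-dpo          # org prefix assumed
    parameters:
      density: 0.5
      weight: 0.3
  - model: EmbeddedLLM/Mistral-7B-Merge-14-v0.2   # org prefix assumed
    parameters:
      density: 0.5
      weight: 0.3
merge_method: dare_ties
base_model: mistralai/Mistral-7B-v0.1
dtype: bfloat16
"""

Path("daredevil_merge.yaml").write_text(CONFIG)

# mergekit's CLI entry point (installed via `pip install mergekit`):
# reads the config and writes the merged model to the output directory.
subprocess.run(
    ["mergekit-yaml", "daredevil_merge.yaml", "./daredevil-7b-merge"],
    check=True,
)
```

With dare_ties, each non-base model's delta from the base is randomly sparsified according to its density before the deltas are combined with the given weights, which is why both parameters appear per model.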

Key Capabilities

  • Strong Benchmark Performance: Daredevil-7B posts competitive results across evaluation suites. On the Nous benchmark suite it achieved an average score of 58.22, outperforming OpenHermes-2.5-Mistral-7B and NeuralHermes-2.5-Mistral-7B, with 44.85 on AGIEval, 76.07 on GPT4All, and 64.89 on TruthfulQA.
  • Open LLM Leaderboard: Preliminary evaluations indicate an average score of 73.36, including 69.37 on the AI2 Reasoning Challenge, 87.17 on HellaSwag, and 65.30 on MMLU (a sketch for reproducing this style of evaluation follows this list).
  • Mistral-7B Base: Inherits the efficient architecture and capabilities of the Mistral-7B base model, providing a solid foundation for diverse NLP tasks.
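The scores above come from external leaderboards. As a rough sketch, benchmarks of this style can be run locally with EleutherAI's lm-evaluation-harness; the task names, few-shot settings, and batch size below are illustrative and will not exactly match the leaderboard's pinned harness version and configuration.

```python
# Sketch of an Open LLM Leaderboard-style evaluation using
# EleutherAI's lm-evaluation-harness (`pip install lm-eval`).
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=mlabonne/Daredevil-7B,dtype=bfloat16",
    tasks=["arc_challenge", "hellaswag", "truthfulqa_mc2"],
    num_fewshot=None,  # None: use each task's default few-shot setting
    batch_size=8,
)

# results["results"] maps each task name to its metric dict.
for task, metrics in results["results"].items():
    print(task, metrics)
```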

Good For

  • General-purpose AI applications: Its balanced performance across multiple benchmarks suggests suitability for a wide range of tasks, including question answering, text generation, and reasoning (see the inference sketch after this list).
  • Research and experimentation: As a merged model, it offers a valuable platform for exploring the effectiveness of model merging techniques and their impact on performance.
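As a starting point for such applications, here is a minimal inference sketch using the Hugging Face transformers library. Only the model id comes from this card; the prompt and generation settings are illustrative.

```python
# Minimal text-generation sketch with transformers.
# device_map="auto" requires the `accelerate` package.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mlabonne/Daredevil-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "Explain model merging in two sentences."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```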