mlabonne/Darewin-7B-v2

Text Generation · Model Size: 7B · Quant: FP8 · Context Length: 4k · Published: Jan 24, 2024 · License: apache-2.0 · Architecture: Transformer

mlabonne/Darewin-7B-v2 is a 7 billion parameter language model created by mlabonne, merged from seven different Mistral-based models using the dare_ties method. The merge combines the strengths of its constituent models to improve general performance across benchmarks, and it supports a 4096-token context length for diverse natural language processing tasks.


Darewin-7B-v2 Overview

Darewin-7B-v2 is a 7 billion parameter language model developed by mlabonne, created by merging seven distinct Mistral-based models using the dare_ties method. This merging technique combines models like OpenPipe/mistral-ft-optimized-1227, Intel/neural-chat-7b-v3-3, and openchat/openchat-3.5-0106, among others, to synthesize their individual strengths into a more robust and versatile model. The base model for this merge was mistralai/Mistral-7B-Instruct-v0.2.
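To make the merging approach concrete, below is a minimal sketch of the DARE (Drop And REscale) step at the heart of dare_ties: each fine-tuned model's delta from the base is randomly sparsified, the surviving entries are rescaled so the expected delta is preserved, and the deltas are then averaged onto the base. This is an illustrative simplification — the function name and parameters are assumptions, it operates on a single flat weight array, and it omits the TIES sign-consensus step that the full dare_ties method also applies.

```python
import numpy as np

def dare_merge(base, finetuned_list, drop_prob=0.9, seed=0):
    """Simplified DARE merge over one flat parameter array.

    For each fine-tuned model: compute its delta vs. the base, randomly
    drop a fraction `drop_prob` of the delta's entries, and rescale the
    survivors by 1 / (1 - drop_prob) so the expected delta is unchanged.
    The rescaled deltas are averaged and added back onto the base.
    (The real dare_ties method additionally resolves sign conflicts
    between deltas, which is omitted here.)
    """
    rng = np.random.default_rng(seed)
    merged_delta = np.zeros_like(base)
    for ft in finetuned_list:
        delta = ft - base
        keep_mask = rng.random(delta.shape) >= drop_prob  # keep ~(1 - drop_prob)
        merged_delta += (delta * keep_mask) / (1.0 - drop_prob)
    return base + merged_delta / len(finetuned_list)
```

The rescaling is what lets DARE drop 90% or more of each delta without shifting the merged model's expected weights, which is why several fine-tunes can be combined with limited interference.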

Key Capabilities & Performance

Darewin-7B-v2 demonstrates solid general-purpose performance, as indicated by its evaluation on the Open LLM Leaderboard:

  • Average Score: 56.34
  • AI2 Reasoning Challenge (25-Shot): 62.63
  • HellaSwag (10-Shot): 78.28
  • MMLU (5-Shot): 53.01
  • TruthfulQA (0-shot): 50.99
  • Winogrande (5-shot): 73.95

While its GSM8k score (19.18) suggests limitations in complex mathematical reasoning, its solid results on reasoning (ARC), common-sense (HellaSwag, Winogrande), and truthfulness (TruthfulQA) benchmarks make it suitable for a wide array of conversational and analytical tasks.

When to Use This Model

Darewin-7B-v2 is a good choice for applications requiring a balanced performance across various NLP tasks, particularly where a merged model's combined strengths are beneficial. Its 4096-token context length supports moderately long interactions. It is well-suited for general instruction following, question answering, and text generation where robust reasoning and factual recall are important.
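Since the merge's constituents are Mistral-Instruct-based, prompts are typically wrapped in the Mistral-Instruct `[INST] ... [/INST]` template, and inputs must fit the 4096-token window. The sketch below is illustrative: the helper names and the ~4-characters-per-token heuristic are assumptions, and a real deployment should use the model's own tokenizer and chat template for exact formatting and counts.

```python
def format_mistral_prompt(user_message: str) -> str:
    """Wrap a user message in the Mistral-Instruct chat template
    ([INST] ... [/INST]) used by this merge's base model."""
    return f"<s>[INST] {user_message.strip()} [/INST]"

def fits_context(prompt: str, ctx_tokens: int = 4096,
                 chars_per_token: int = 4) -> bool:
    """Rough pre-check against the 4096-token context window using a
    ~4-characters-per-token heuristic; use the actual tokenizer for an
    exact count before truncating."""
    return len(prompt) // chars_per_token <= ctx_tokens
```

A cheap length pre-check like this is useful for rejecting or truncating oversized inputs before they reach the model, where exceeding the context window would otherwise silently clip the prompt.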