Weyaxi/Dolphin2.1-OpenOrca-7B

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Oct 11, 2023 · License: apache-2.0 · Architecture: Transformer · Open Weights

Weyaxi/Dolphin2.1-OpenOrca-7B is a 7-billion-parameter instruction-tuned language model created by Weyaxi as a TIES merge of ehartford/dolphin-2.1-mistral-7b and Open-Orca/Mistral-7B-OpenOrca. The merge combines the strengths of both parent models, giving balanced performance across a range of reasoning and language-understanding tasks. With a 4096-token context length, it suits general-purpose conversational AI and instruction-following applications.


Model Overview

Weyaxi/Dolphin2.1-OpenOrca-7B is a 7-billion-parameter instruction-tuned language model developed by Weyaxi. It is the product of a TIES merge combining two prominent Mistral-7B-based models:

  • ehartford/dolphin-2.1-mistral-7b
  • Open-Orca/Mistral-7B-OpenOrca

This merging strategy, TIES (TrIm, Elect Sign & Merge), trims small per-parameter deltas and resolves sign conflicts between the parents before averaging, aiming to preserve the strengths of both components in a single versatile model for a broad range of natural language processing tasks.
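The three TIES steps (trim, elect sign, disjoint merge) can be sketched in a few lines of numpy. This is a conceptual illustration only, not the mergekit implementation actually used to build this model; the `density` parameter and the flat-weight-vector view are simplifying assumptions.

```python
import numpy as np

def ties_merge(base, models, density=0.5):
    """Conceptual sketch of TIES merging on flat weight vectors.

    base:    1-D array of base-model weights
    models:  list of 1-D arrays of fine-tuned weights (same shape as base)
    density: fraction of each task vector kept after magnitude trimming
    """
    deltas = [m - base for m in models]  # "task vectors": fine-tune minus base

    # Trim: keep only the top-`density` fraction of entries by magnitude.
    trimmed = []
    for d in deltas:
        k = max(1, int(density * d.size))
        cutoff = np.sort(np.abs(d))[-k]
        trimmed.append(np.where(np.abs(d) >= cutoff, d, 0.0))

    # Elect sign: per parameter, the sign with the larger total mass wins.
    stacked = np.stack(trimmed)
    sign = np.sign(stacked.sum(axis=0))
    sign[sign == 0] = 1.0

    # Merge: average only the entries that agree with the elected sign.
    agree = (np.sign(stacked) == sign) & (stacked != 0)
    counts = np.maximum(agree.sum(axis=0), 1)
    merged_delta = (stacked * agree).sum(axis=0) / counts

    return base + merged_delta
```

Only sign-consistent updates are averaged, which is what lets a TIES merge retain capabilities from both parents instead of washing them out.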

Performance Benchmarks

Evaluated on the Open LLM Leaderboard, Dolphin2.1-OpenOrca-7B demonstrates competitive performance for its size:

  • Average Score: 60.47
  • AI2 Reasoning Challenge (25-shot): 63.91
  • HellaSwag (10-shot): 84.26
  • MMLU (5-shot): 62.66
  • TruthfulQA (0-shot): 53.84
  • Winogrande (5-shot): 78.22
  • GSM8k (5-shot): 19.94

These scores indicate a strong capability in common sense reasoning, language understanding, and general knowledge, while showing room for improvement in complex mathematical reasoning (GSM8k).
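The headline Average Score is simply the unweighted mean of the six benchmark scores, which is easy to verify:

```python
# Open LLM Leaderboard scores reported above
scores = {
    "ARC (25-shot)": 63.91,
    "HellaSwag (10-shot)": 84.26,
    "MMLU (5-shot)": 62.66,
    "TruthfulQA (0-shot)": 53.84,
    "Winogrande (5-shot)": 78.22,
    "GSM8k (5-shot)": 19.94,
}
average = round(sum(scores.values()) / len(scores), 2)
print(average)  # 60.47
```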

Quantized Versions

For optimized deployment and inference, several quantized versions are available, provided by TheBloke.
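To see why quantized builds matter for a 7B model, here is a minimal numpy sketch of symmetric per-tensor int8 quantization. This is a toy illustration of the general idea, not the GGUF/GPTQ/AWQ algorithms TheBloke's releases actually use; the tensor and its size are made up for the demo.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization: w ≈ scale * q."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=4096).astype(np.float32)  # stand-in weight tensor
q, scale = quantize_int8(w)

print(w.nbytes // q.nbytes)  # int8 storage is 4x smaller than float32
print(float(np.max(np.abs(w - dequantize(q, scale)))))  # small round-off error
```

Real quantization schemes use per-group scales and calibration to keep this round-off error from degrading model quality, but the storage saving is the same: 8-bit weights cut a 7B model's footprint from ~14 GB (FP16) to roughly 7 GB.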

Ideal Use Cases

This model is well-suited for applications requiring:

  • General instruction following and conversational AI.
  • Tasks benefiting from strong common sense and language understanding.
  • Scenarios where a 7B parameter model with a 4096-token context window offers a balance between performance and computational efficiency.
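Both parent models are trained on the ChatML prompt format, so prompting this merge the same way is a reasonable starting point. Below is a small helper that builds a ChatML prompt, with the Hugging Face transformers inference call shown only as a commented sketch, since running it requires downloading the full weights.

```python
def chatml_prompt(system, user):
    """Build a ChatML prompt, the format used by the Dolphin and OpenOrca parents."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = chatml_prompt(
    "You are a helpful assistant.",
    "Summarize TIES merging in one sentence.",
)

# Sketch of inference with Hugging Face transformers (downloads ~14 GB of weights):
#
#   from transformers import AutoModelForCausalLM, AutoTokenizer
#   tok = AutoTokenizer.from_pretrained("Weyaxi/Dolphin2.1-OpenOrca-7B")
#   model = AutoModelForCausalLM.from_pretrained(
#       "Weyaxi/Dolphin2.1-OpenOrca-7B", device_map="auto"
#   )
#   inputs = tok(prompt, return_tensors="pt").to(model.device)
#   out = model.generate(**inputs, max_new_tokens=256)
#   print(tok.decode(out[0], skip_special_tokens=True))
```

The trailing `<|im_start|>assistant\n` leaves the prompt open for the model to complete, which is the standard ChatML convention for eliciting the assistant turn.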