InferenceIllusionist/Magic-Dolphin-7b

Text generation · Concurrency cost: 1 · Model size: 7B · Quant: FP8 · Context length: 4k · Published: Mar 3, 2024 · License: apache-2.0 · Architecture: Transformer

Magic-Dolphin-7b by InferenceIllusionist is a 7-billion-parameter language model created as a linear merge of dolphin-2.6-mistral-7b-dpo-laser, Hyperion-1.5-Mistral-7B, and merlinite-7b. With a 4096-token context length, it is designed to strengthen performance on technical topics by combining models with proven acumen in that area. It posts improved benchmark results, particularly on GSM8K, making it suitable for applications that demand strong reasoning in technical domains.
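For quick experimentation, the model can be loaded with the standard Hugging Face transformers API. The sketch below is illustrative rather than official: the repo ID comes from this page, the prompt follows the Alpaca template noted under Merge Details, and the device and generation settings are placeholder assumptions.

```python
# Minimal sketch: loading Magic-Dolphin-7b with Hugging Face transformers.
# device_map="auto" assumes the accelerate package is installed; dtype and
# generation settings are illustrative, not prescribed by the model card.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "InferenceIllusionist/Magic-Dolphin-7b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Alpaca-style instruction prompt, per the template noted in Merge Details.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nWhat is a linear model merge?\n\n"
    "### Response:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Prompts should stay within the 4096-token context window, including the instruction scaffolding above.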


Magic-Dolphin-7b Overview

Magic-Dolphin-7b is a 7-billion-parameter language model developed by InferenceIllusionist, created through a linear merge of three models: cognitivecomputations/dolphin-2.6-mistral-7b-dpo-laser, Locutusque/Hyperion-1.5-Mistral-7B, and ibm/merlinite-7b. The merge process involved testing various weight ratios, with merlinite-7b ultimately weighted more heavily than Hyperion-1.5-Mistral-7B to refine performance (see Merge Details below).

Key Capabilities & Performance

The model aims to combine the strengths of its constituent models, each of which showed strong acumen in technical topics. It also serves as an experiment in how LAB tuning (the method behind merlinite-7b) is affected by merging with models that leverage DPO. Benchmarked against its component models on the Open LLM Leaderboard, Magic-Dolphin-7b achieves an average score of 67.48, including 51.18 on GSM8K, higher than any of its individual base models, alongside strong results on ARC (65.78) and HellaSwag (85.61).

Merge Details

The model was created using the linear merge method, with the following weights for each base model: dolphin-2.6-mistral-7b-dpo-laser (1.0), Hyperion-1.5-Mistral-7B (0.3), and merlinite-7b (0.5). GGUF quantizations are available for broader compatibility, and the model uses the Alpaca template for instruction following.
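For intuition, a linear merge is simply a normalized weighted average of the source models' parameters. The sketch below is a rough illustration of that arithmetic using the 1.0 / 0.3 / 0.5 ratios above, not the release pipeline (which presumably used a dedicated merging tool such as mergekit); it assumes the three repos are downloadable and share the same Mistral-7B parameter shapes.

```python
# Hand-rolled linear merge sketch: average parameters across the three source
# models, scaled by their normalized weights. Loading three 7B models
# sequentially is memory-hungry; this is for illustration only.
import torch
from transformers import AutoModelForCausalLM

sources = {
    "cognitivecomputations/dolphin-2.6-mistral-7b-dpo-laser": 1.0,
    "Locutusque/Hyperion-1.5-Mistral-7B": 0.3,
    "ibm/merlinite-7b": 0.5,
}
total = sum(sources.values())

merged_state = None
for repo_id, weight in sources.items():
    model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.float16)
    state = model.state_dict()
    if merged_state is None:
        # Initialize the running sum with the first model's scaled parameters.
        merged_state = {k: v * (weight / total) for k, v in state.items()}
    else:
        # Accumulate each subsequent model, scaled by its normalized weight.
        for k, v in state.items():
            merged_state[k] += v * (weight / total)
    del model, state  # release each model before loading the next

# Write the averaged parameters back into a model instance and save it.
merged = AutoModelForCausalLM.from_pretrained(
    "cognitivecomputations/dolphin-2.6-mistral-7b-dpo-laser",
    torch_dtype=torch.float16,
)
merged.load_state_dict(merged_state)
merged.save_pretrained("magic-dolphin-7b-merged")
```

In practice a tool like mergekit also handles details this sketch glosses over, such as tokenizer and vocabulary alignment between the source models; the hand-rolled version just makes the weighted-average step explicit.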