arcee-ai/Mistral-Instruct-Orca-Slerp

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Feb 28, 2024 · License: apache-2.0 · Architecture: Transformer

Mistral-Instruct-Orca-Slerp by arcee-ai is a 7 billion parameter language model with a 4096-token context length, created by merging Mistral-7B-Instruct-v0.2 and Mistral-7B-OpenOrca using the slerp method. The merge combines Mistral-Instruct's instruction-following strengths with OpenOrca's Orca-style fine-tuning for enhanced reasoning and conversational performance, yielding a general-purpose model that leverages the strengths of both constituents.


Model Overview

Mistral-Instruct-Orca-Slerp is a 7 billion parameter language model developed by arcee-ai, built upon the Mistral architecture. It features a 4096 token context window, making it suitable for a variety of conversational and instruction-following applications.

Key Capabilities

This model is a product of merging two prominent Mistral-based models:

  • mistralai/Mistral-7B-Instruct-v0.2: Known for its strong instruction-following and general-purpose conversational abilities.
  • Open-Orca/Mistral-7B-OpenOrca: Fine-tuned with the OpenOrca dataset, which emphasizes complex reasoning and detailed responses.

The merge was performed using the slerp (Spherical Linear Interpolation) method via mergekit, allowing for a balanced combination of the strengths from both base models. This approach aims to create a model that retains the robust instruction-following of Mistral-Instruct while integrating the enhanced reasoning capabilities derived from Orca-style training.
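Merges of this kind are driven by a declarative mergekit YAML file. The exact configuration used for this model is not reproduced here; the sketch below is a hypothetical slerp config illustrating the typical shape (the layer range, interpolation factor `t`, and dtype are illustrative assumptions, not the published settings):

```yaml
# Hypothetical mergekit slerp configuration (illustrative, not the official one)
slices:
  - sources:
      - model: mistralai/Mistral-7B-Instruct-v0.2
        layer_range: [0, 32]
      - model: Open-Orca/Mistral-7B-OpenOrca
        layer_range: [0, 32]
merge_method: slerp
base_model: mistralai/Mistral-7B-Instruct-v0.2
parameters:
  t: 0.5   # interpolation factor: 0 = base model only, 1 = the other model only
dtype: bfloat16
```

Given such a file, `mergekit-yaml config.yml ./merged-model` writes out the merged checkpoint.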

Use Cases

Mistral-Instruct-Orca-Slerp is well-suited for tasks requiring:

  • General instruction following: Responding accurately to a wide range of prompts.
  • Conversational AI: Engaging in coherent and contextually relevant dialogues.
  • Reasoning tasks: Benefiting from the Orca-style fine-tuning for more complex problem-solving.

This model offers a versatile option for developers seeking a 7B parameter model with a blend of strong instruction adherence and improved reasoning.
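For intuition about what slerp does to the merged weights, here is a minimal NumPy sketch of spherical linear interpolation (illustrative only; mergekit's actual implementation handles tensors and edge cases in more detail):

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two weight vectors.

    t=0 returns v0, t=1 returns v1; intermediate values follow the
    great-circle arc between the two directions rather than a straight line.
    """
    # Angle between the two vectors, computed on their normalized directions
    v0n = v0 / np.linalg.norm(v0)
    v1n = v1 / np.linalg.norm(v1)
    dot = np.clip(np.dot(v0n, v1n), -1.0, 1.0)
    omega = np.arccos(dot)
    if omega < eps:
        # Nearly parallel vectors: fall back to plain linear interpolation
        return (1 - t) * v0 + t * v1
    so = np.sin(omega)
    return (np.sin((1 - t) * omega) / so) * v0 + (np.sin(t * omega) / so) * v1
```

Unlike naive linear averaging, slerp preserves the norm when interpolating unit vectors, which is one reason it is a popular choice for blending model weights.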