gqd/mistral-merge-7b

  • Task: Text generation
  • Model Size: 7B
  • Quant: FP8
  • Context Length: 4k
  • Published: Jan 6, 2024
  • License: Unlicense
  • Architecture: Transformer
  • Concurrency Cost: 1

gqd/mistral-merge-7b is a 7 billion parameter language model created by gqd, formed by linearly merging teknium/OpenHermes-2.5-Mistral-7B and Open-Orca/Mistral-7B-SlimOrca. The model combines the strengths of its constituent Mistral-7B-based models, offering a 4096-token context length, and is designed to retain the instruction-following and conversational capabilities of its merged components.


Overview

gqd/mistral-merge-7b is a 7 billion parameter language model resulting from a linear merge of two prominent Mistral-7B-based models: teknium/OpenHermes-2.5-Mistral-7B and Open-Orca/Mistral-7B-SlimOrca. The merge was performed using the mergekit tool, with the aim of combining the distinct characteristics and strengths of its base models.

Merge Details

The model was created using the linear merge method, where both OpenHermes-2.5-Mistral-7B and Mistral-7B-SlimOrca were given equal weighting (1.0) during the merging process. This approach allows for a balanced integration of the capabilities present in each original model.
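A merge like this is typically expressed as a mergekit configuration file. The following is a sketch consistent with the description above; the `dtype` setting is an assumption, as the model card does not state it:

```yaml
# Hypothetical mergekit config reconstructing this merge:
# equal-weight linear merge of the two base models.
models:
  - model: teknium/OpenHermes-2.5-Mistral-7B
    parameters:
      weight: 1.0
  - model: Open-Orca/Mistral-7B-SlimOrca
    parameters:
      weight: 1.0
merge_method: linear
dtype: float16  # assumed; not specified on the model card
```

With `merge_method: linear`, mergekit averages the corresponding parameter tensors of the listed models, scaled by their (normalized) weights.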

Key Characteristics

  • Architecture: Based on the Mistral-7B architecture.
  • Parameter Count: 7 billion parameters.
  • Context Length: Supports a context window of 4096 tokens.
  • Origin: A composite model, inheriting features from both OpenHermes-2.5 (known for its instruction-following and conversational abilities) and SlimOrca (trained on a curated subset of the OpenOrca dataset).
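Conceptually, a linear merge with equal weights is just an elementwise average of corresponding parameter tensors across the source models. A minimal NumPy sketch (the `linear_merge` function is illustrative, not part of mergekit):

```python
import numpy as np

def linear_merge(tensors, weights):
    """Return the weighted average of parameter tensors, normalizing weights."""
    total = sum(weights)
    return sum((w / total) * t for w, t in zip(tensors, weights))

# Toy stand-ins for one parameter tensor from each source model.
a = np.array([1.0, 2.0, 3.0])  # e.g. from OpenHermes-2.5-Mistral-7B
b = np.array([3.0, 4.0, 5.0])  # e.g. from Mistral-7B-SlimOrca

# Equal weighting (1.0 each), as used for this merge: the elementwise mean.
merged = linear_merge([a, b], [1.0, 1.0])
print(merged)  # [2. 3. 4.]
```

In the real merge this averaging is applied to every weight tensor of the two 7B models, producing a single model of the same shape.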

Potential Use Cases

This merged model is suitable for applications requiring:

  • General-purpose text generation: Leveraging the broad capabilities of its base models.
  • Instruction following: Benefiting from the instruction-tuned nature of OpenHermes-2.5.
  • Conversational AI: Drawing on the dialogue-oriented strengths of its components.