Weyaxi/Instruct-v0.2-Seraph-7B

Text generation · Concurrency cost: 1 · Model size: 7B · Quant: FP8 · Context length: 4k · Published: Dec 12, 2023 · License: apache-2.0 · Architecture: Transformer

Weyaxi/Instruct-v0.2-Seraph-7B is a 7 billion parameter instruction-tuned language model created by Weyaxi, built by merging Weyaxi/Seraph-7B and mistralai/Mistral-7B-Instruct-v0.2. The merge uses the slerp method to blend characteristics of both base models. It is designed for general instruction-following tasks and supports a 4096-token context length.


Model Overview

Weyaxi/Instruct-v0.2-Seraph-7B is a 7 billion parameter instruction-tuned language model developed by Weyaxi. This model was constructed using mergekit to combine two distinct base models: Weyaxi/Seraph-7B and mistralai/Mistral-7B-Instruct-v0.2. The merging process employed a slerp (spherical linear interpolation) method, which allows for a nuanced blend of the characteristics from its constituent models.

Key Technical Details

  • Architecture: Merged model based on Mistral-7B-v0.1 as the base, incorporating elements from Seraph-7B and Mistral-7B-Instruct-v0.2.
  • Parameter Count: 7 billion parameters.
  • Context Length: Supports a context window of 4096 tokens.
  • Merging Strategy: Utilizes a slerp merge, with separate interpolation weights applied to the self-attention and MLP layers, allowing a tailored blend of the strengths of the source models.
  • Data Type: Processed in bfloat16 precision.
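To make the merging strategy concrete, here is a minimal NumPy sketch of slerp (spherical linear interpolation) applied to a pair of weight tensors. This is an illustration of the general technique, not mergekit's exact implementation; in practice, mergekit applies per-layer interpolation factors like those noted above.

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two weight tensors.

    t=0 returns v0, t=1 returns v1; intermediate t values rotate
    along the arc between the (flattened) tensors.
    """
    v0f, v1f = v0.ravel(), v1.ravel()
    # Normalize copies to measure the angle between the tensors.
    n0 = v0f / (np.linalg.norm(v0f) + eps)
    n1 = v1f / (np.linalg.norm(v1f) + eps)
    dot = np.clip(np.dot(n0, n1), -1.0, 1.0)
    theta = np.arccos(dot)
    if theta < eps:
        # Nearly parallel tensors: fall back to linear interpolation.
        return ((1 - t) * v0f + t * v1f).reshape(v0.shape)
    s = np.sin(theta)
    return ((np.sin((1 - t) * theta) / s) * v0f
            + (np.sin(t * theta) / s) * v1f).reshape(v0.shape)

# Toy example: blend two 2x2 "weight matrices" halfway.
w_seraph = np.array([[1.0, 0.0], [0.5, 0.5]])
w_mistral = np.array([[0.0, 1.0], [0.5, -0.5]])
w_merged = slerp(0.5, w_seraph, w_mistral)
```

Unlike plain averaging, slerp preserves the magnitude relationship between the two weight sets along the interpolation arc, which is why it is often preferred for model merging.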

Intended Use Cases

This model is primarily designed for instruction-following tasks, benefiting from the instruction-tuned component of Mistral-7B-Instruct-v0.2. The merge aims to balance and enhance the capabilities of both Seraph-7B and the Mistral instruction model. Developers can use it for applications requiring robust responses to prompts and instructions within its 4096-token context window.
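Because one of the merged parents is Mistral-7B-Instruct-v0.2, prompts most likely need to follow Mistral's `[INST] ... [/INST]` chat format. This is an assumption based on the parent model, not something stated on this card; a minimal prompt-builder sketch:

```python
def build_instruct_prompt(user_message: str) -> str:
    # Mistral-Instruct format (assumed to carry over to this merge):
    # the user turn is wrapped in [INST] ... [/INST], preceded by the
    # BOS token; the model's completion follows the closing tag.
    return f"<s>[INST] {user_message.strip()} [/INST]"

prompt = build_instruct_prompt("Explain slerp merging in one sentence.")
```

When loading the model with a library such as `transformers`, prefer the tokenizer's own chat template if one is provided, since it encodes the exact format the model was tuned on.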