flammenai/flammen11X-mistral-7B

Text Generation · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Mar 24, 2024 · License: apache-2.0 · Architecture: Transformer

flammenai/flammen11X-mistral-7B is a 7 billion parameter language model created by flammenai, merged using the SLERP method from nbeerbower/flammen11-mistral-7B and ChaoticNeutrals/Prima-LelantaclesV5-7b. This model leverages the Mistral architecture with a 4096 token context length. It is designed as a composite model, combining the strengths of its constituent models for general language generation tasks.


Model Overview

flammenai/flammen11X-mistral-7B is a 7 billion parameter language model built upon the Mistral architecture, featuring a 4096 token context length. This model was created by flammenai using the mergekit tool, specifically employing the SLERP (Spherical Linear Interpolation) merge method.

Merge Details

The model is a composite of two distinct pre-trained language models:

  • nbeerbower/flammen11-mistral-7B
  • ChaoticNeutrals/Prima-LelantaclesV5-7b

The merging process combined all 32 layers from both base models. The configuration specified separate interpolation-weight (t) schedules for the self_attn and mlp filters, so the models' attention and feed-forward (MLP) components were blended at different ratios across layers. nbeerbower/flammen11-mistral-7B served as the base model for the merge, and the computation was performed in bfloat16 precision.
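A SLERP merge of this shape is typically expressed as a mergekit YAML config. The sketch below is illustrative only: the structure matches the details above (32 layers from each model, self_attn and mlp t filters, bfloat16), but the numeric t values are placeholders, not the model's published settings.

```yaml
# Illustrative mergekit SLERP config -- the t values below are
# placeholders, not the values used for flammen11X-mistral-7B.
slices:
  - sources:
      - model: nbeerbower/flammen11-mistral-7B
        layer_range: [0, 32]
      - model: ChaoticNeutrals/Prima-LelantaclesV5-7b
        layer_range: [0, 32]
merge_method: slerp
base_model: nbeerbower/flammen11-mistral-7B
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]   # placeholder schedule
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]   # placeholder schedule
    - value: 0.5                     # default t for all other tensors
dtype: bfloat16
```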

Key Characteristics

  • Composite Architecture: Benefits from the combined knowledge and capabilities of its constituent models.
  • SLERP Merge Method: Spherical linear interpolation blends parameters along the unit sphere rather than a straight line, which better preserves weight norms than plain averaging.
  • Mistral Base: Inherits the efficiency and performance characteristics of the Mistral 7B family.
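To make the SLERP method above concrete: it interpolates along the great circle between two weight vectors rather than along a straight line, which preserves the magnitude of the merged parameters better than naive linear averaging. A minimal per-tensor sketch, assuming flattened weight vectors (an illustration of the technique, not mergekit's internal code):

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two flattened weight vectors."""
    v0_n = v0 / (np.linalg.norm(v0) + eps)
    v1_n = v1 / (np.linalg.norm(v1) + eps)
    dot = np.clip(np.dot(v0_n, v1_n), -1.0, 1.0)
    # Nearly parallel vectors: fall back to plain linear interpolation
    if abs(dot) > 0.9995:
        return (1 - t) * v0 + t * v1
    theta = np.arccos(dot)               # angle between the two vectors
    sin_theta = np.sin(theta)
    s0 = np.sin((1 - t) * theta) / sin_theta
    s1 = np.sin(t * theta) / sin_theta
    return s0 * v0 + s1 * v1
```

At t = 0 the result is the first model's weights and at t = 1 the second's; intermediate t values trace the spherical path between them, which is what the per-filter t schedules in the merge config control.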

Potential Use Cases

This model is suitable for a variety of general-purpose language generation tasks, leveraging the combined expertise of its merged components. Its composite nature suggests potential for balanced performance across different domains that its base models excel in.
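If the weights are published on Hugging Face under the model id shown above, inference would follow the standard transformers pattern. A hedged sketch (the model id and generation settings are assumptions, not verified against a published repo); imports happen at call time so defining the helper does not trigger the large model download:

```python
def generate(prompt: str,
             model_id: str = "flammenai/flammen11X-mistral-7B",
             max_new_tokens: int = 128) -> str:
    """Load the merged model and generate a completion for `prompt`.

    Assumes the `transformers` library is installed and the model id
    resolves on the Hugging Face Hub (an assumption, not verified here).
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id,
                                                 torch_dtype="bfloat16")
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

Because the model was merged in bfloat16, loading in the same dtype keeps memory use around 14 GB for the 7B parameters; prompts should stay within the 4096 token context length.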