s3nh/NousHermes-Kunoichi-SolarMaid-7b

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 8K · Published: Jan 5, 2024 · License: openrail · Architecture: Transformer · Open Weights

s3nh/NousHermes-Kunoichi-SolarMaid-7b is an experimental 7 billion parameter language model, created by s3nh, that merges three models: SanjiWatsuki/Kunoichi-7B, Undi95/SolarMaid-v0.1.1, and NousResearch/Nous-Hermes-llama-2-7b. The merge uses the SLERP method, configured to blend the self-attention and MLP layers of its constituent models with opposing layer-wise schedules, with the aim of combining the strengths of each base model across a range of natural language processing tasks.


Model Overview

s3nh/NousHermes-Kunoichi-SolarMaid-7b combines the characteristics of three distinct base models. It was produced with mergekit using the SLERP (Spherical Linear Interpolation) merge method, which interpolates model weights along an arc on a hypersphere rather than along a straight line.
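For reference, the standard SLERP formula for blending two weight tensors $w_1$ and $w_2$ with interpolation coefficient $t \in [0, 1]$ is

$$\operatorname{slerp}(w_1, w_2; t) = \frac{\sin\big((1-t)\,\theta\big)}{\sin\theta}\, w_1 + \frac{\sin(t\,\theta)}{\sin\theta}\, w_2,$$

where $\theta$ is the angle between the two (flattened and normalized) weight vectors. Unlike plain linear averaging, this keeps the interpolated weights on the arc between the parents, which in practice tends to preserve more of each model's behavior.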

Merged Components

This model integrates the following pre-trained language models:

  • SanjiWatsuki/Kunoichi-7B
  • Undi95/SolarMaid-v0.1.1
  • NousResearch/Nous-Hermes-llama-2-7b

Merge Configuration

The SLERP merge was executed with layer-wise interpolation schedules. The self_attn (self-attention) weights were blended with interpolation coefficients spanning 0.0 to 1.0 across the layer stack, while the mlp (multi-layer perceptron) weights used the inverse distribution, spanning 1.0 down to 0.0. In effect, attention behavior is drawn increasingly from one parent at deeper layers while feed-forward behavior is drawn increasingly from the other, with the aim of leveraging the distinct strengths of each merged component. An illustrative configuration is sketched below.
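For illustration only: mergekit's slerp method merges exactly two models per pass, so a three-model merge like this one is typically built in stages. The sketch below shows what one stage following the description above might look like; the layer count (32 for a 7B model), base-model choice, dtype, and the exact per-layer t values are assumptions, not the author's published configuration.

```yaml
# Illustrative mergekit SLERP config (assumed values, not the published one).
# slerp merges two models per pass; a three-way merge is done in stages.
slices:
  - sources:
      - model: NousResearch/Nous-Hermes-llama-2-7b
        layer_range: [0, 32]
      - model: SanjiWatsuki/Kunoichi-7B
        layer_range: [0, 32]
merge_method: slerp
base_model: NousResearch/Nous-Hermes-llama-2-7b
parameters:
  t:
    - filter: self_attn
      value: [0.0, 0.3, 0.5, 0.7, 1.0]  # attention blend: 0.0 -> 1.0 across layers
    - filter: mlp
      value: [1.0, 0.7, 0.5, 0.3, 0.0]  # MLP blend: the inverse distribution
    - value: 0.5                        # default for all remaining tensors
dtype: bfloat16
```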

Current Status

This model is experimental and may not function optimally; it remains under development and is subject to updates.
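Given that status, a quick smoke test before integrating the model is sensible. Below is a minimal sketch using the standard Hugging Face transformers API; it assumes the weights are hosted on the Hub under the repo id above, and that accelerate is installed for device_map="auto".

```python
# Minimal smoke test: load the merged model and generate a short completion.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "s3nh/NousHermes-Kunoichi-SolarMaid-7b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # or torch.bfloat16 on supported GPUs
    device_map="auto",          # requires the accelerate package
)

prompt = "Explain spherical linear interpolation in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```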