MatthieuJ/ING_2003M3_SLERP

Text generation · Concurrency cost: 1 · Model size: 7B · Quantization: FP8 · Context length: 4k · Published: Mar 20, 2024 · License: apache-2.0 · Architecture: Transformer · Open weights

MatthieuJ/ING_2003M3_SLERP is a 7 billion parameter language model created by MatthieuJ, formed by merging chihoonlee10/T3Q-DPO-Mistral-7B and MatthieuJ/ING_2003M2_SLERP using the SLERP method. This model leverages the strengths of its constituent models, specifically combining a DPO-tuned Mistral variant with another merged model. It is designed for general language tasks, benefiting from the combined knowledge and fine-tuning of its merged components.


Model Overview

MatthieuJ/ING_2003M3_SLERP is a 7 billion parameter language model developed by MatthieuJ. This model is a product of a merge operation using mergekit, combining two distinct base models:

  • chihoonlee10/T3Q-DPO-Mistral-7B: A Mistral-7B variant that has undergone DPO (Direct Preference Optimization) fine-tuning.
  • MatthieuJ/ING_2003M2_SLERP: Another merged model, indicating an iterative merging approach.
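A mergekit SLERP merge of these two models would be declared in a YAML configuration along the following lines. This is a hedged reconstruction from the values described in this card, not the author's published config: the `base_model` choice, `layer_range`, and `dtype` are assumptions.

```yaml
slices:
  - sources:
      - model: chihoonlee10/T3Q-DPO-Mistral-7B
        layer_range: [0, 32]   # assumed: all 32 Mistral-7B layers
      - model: MatthieuJ/ING_2003M2_SLERP
        layer_range: [0, 32]
merge_method: slerp
base_model: MatthieuJ/ING_2003M2_SLERP   # assumption
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5   # default where not overridden
dtype: bfloat16   # assumption
```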

Merge Configuration

The model was created using the SLERP (Spherical Linear Interpolation) merge method. Rather than averaging weights linearly, SLERP interpolates along the arc between the two models' parameter vectors on a hypersphere, which better preserves the magnitude and direction of the weights. The configuration specifies distinct interpolation values (t) for different parts of the neural network:

  • Self-attention layers: Interpolation values range from 0 to 1, with specific values like 0.5, 0.3, and 0.7 applied across different layers.
  • MLP (Multi-Layer Perceptron) layers: Interpolation values are also varied, including 1, 0.5, 0.7, 0.3, and 0.
  • General parameters: A default interpolation value of 0.5 is applied where not specifically overridden.
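The interpolation itself can be sketched in a few lines of plain Python. This is an illustrative implementation of the SLERP formula applied to flattened parameter vectors, not mergekit's actual code:

```python
import math


def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two parameter vectors.

    t=0 returns v0, t=1 returns v1; intermediate t follows the arc
    between them on the hypersphere rather than the straight line.
    """
    dot = sum(a * b for a, b in zip(v0, v1))
    n0 = math.sqrt(sum(a * a for a in v0))
    n1 = math.sqrt(sum(b * b for b in v1))
    # Clamp to avoid domain errors from floating-point drift.
    cos_omega = max(-1.0, min(1.0, dot / (n0 * n1)))
    omega = math.acos(cos_omega)
    if abs(math.sin(omega)) < eps:
        # Nearly parallel vectors: fall back to linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    s0 = math.sin((1 - t) * omega) / math.sin(omega)
    s1 = math.sin(t * omega) / math.sin(omega)
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]
```

In a real merge this is applied tensor-by-tensor, with the per-layer `t` schedule above selecting how much each self-attention or MLP block leans toward one parent or the other.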

Key Characteristics

  • Parameter Count: 7 billion parameters.
  • Context Length: 4096 tokens.
  • Architecture: Based on the Mistral family, inheriting its efficient architecture.
  • Training Method: Leverages the benefits of DPO from one of its base models, suggesting improved instruction following and preference alignment.

Potential Use Cases

Given its merged nature and DPO-tuned component, ING_2003M3_SLERP is likely suitable for a variety of general-purpose language generation and understanding tasks, particularly those benefiting from instruction-following capabilities.
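Because one parent is a DPO-tuned Mistral variant, prompts presumably follow the Mistral instruction format. The helper below is hypothetical; the model's actual chat template should be checked via its tokenizer before relying on this:

```python
def format_prompt(user_message: str) -> str:
    # Mistral-style [INST] wrapper; hypothetical helper -- verify against
    # the model's tokenizer chat template before relying on it.
    return f"<s>[INST] {user_message.strip()} [/INST]"
```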