Goekdeniz-Guelmez/J.O.S.I.E.3-Beta4-slerp

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4K · Published: Mar 15, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights

J.O.S.I.E.3-Beta4-slerp is a 7 billion parameter language model created by Goekdeniz-Guelmez, formed by merging Weyaxi/Einstein-v4-7B and cognitivecomputations/dolphin-2.8-experiment26-7b with the slerp method. It posts an overall accuracy of 0.6395 across a broad benchmark suite, including 0.8409 acc_norm on HellaSwag. It is designed for general-purpose language generation and understanding tasks, leveraging the combined strengths of its constituent models.


J.O.S.I.E.3-Beta4-slerp Overview

J.O.S.I.E.3-Beta4-slerp is a 7 billion parameter language model developed by Goekdeniz-Guelmez. It is a product of merging two distinct models, Weyaxi/Einstein-v4-7B and cognitivecomputations/dolphin-2.8-experiment26-7b, utilizing the slerp (spherical linear interpolation) merge method via LazyMergekit. This approach aims to combine the strengths of both base models to achieve improved performance.
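For intuition, spherical linear interpolation blends two weight tensors along the arc of a hypersphere rather than along a straight line, which tends to preserve the scale and geometry of each parent's parameters better than plain averaging. Below is a minimal, self-contained sketch of the operation as it is commonly applied in weight merging; it is illustrative only, and the linear-interpolation fallback for near-parallel tensors is an assumption, not the exact mergekit implementation.

```python
import numpy as np

def slerp(t: float, w0: np.ndarray, w1: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Spherical linear interpolation between two weight tensors.

    t=0 returns w0 (e.g. the Einstein-v4-7B side), t=1 returns w1
    (e.g. the dolphin-2.8-experiment26-7b side).
    """
    v0 = w0.ravel().astype(np.float64)
    v1 = w1.ravel().astype(np.float64)
    # Angle between the two flattened weight vectors.
    cos_theta = np.dot(v0, v1) / (np.linalg.norm(v0) * np.linalg.norm(v1) + eps)
    theta = np.arccos(np.clip(cos_theta, -1.0, 1.0))
    if np.sin(theta) < eps:
        # Nearly parallel tensors: fall back to plain linear interpolation.
        merged = (1.0 - t) * v0 + t * v1
    else:
        # Interpolate along the great-circle arc between the two tensors.
        merged = (np.sin((1.0 - t) * theta) * v0 + np.sin(t * theta) * v1) / np.sin(theta)
    return merged.reshape(w0.shape).astype(w0.dtype)
```

mergekit applies this per tensor, with the interpolation factor t allowed to vary by layer and by parameter type, as discussed in the configuration notes below.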

Key Capabilities & Performance

This model exhibits balanced performance across a range of benchmarks, with an overall accuracy (acc) of 0.6395. Notable scores include (a reproduction sketch follows this list):

  • HellaSwag: 0.8409 acc_norm
  • ARC Challenge: 0.6356 acc_norm
  • MMLU High School Psychology: 0.8440 acc_norm
  • MMLU High School Government and Politics: 0.8963 acc_norm
  • TruthfulQA: 0.5593 mc2
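
The acc_norm and mc2 metric names follow the conventions of EleutherAI's lm-evaluation-harness, so scores in this range could in principle be checked with something like the sketch below. The task selection and few-shot handling shown are assumptions based on common leaderboard practice, not a confirmed evaluation recipe for this model.

```python
# Sketch of reproducing the reported metrics with lm-evaluation-harness
# (pip install lm-eval). Leaderboard runs typically use different few-shot
# counts per task (an assumption here); this call uses each task's default.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=Goekdeniz-Guelmez/J.O.S.I.E.3-Beta4-slerp,dtype=bfloat16",
    tasks=["hellaswag", "arc_challenge", "truthfulqa_mc2"],
)
for task, metrics in results["results"].items():
    print(task, metrics)
```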

The merge configuration applies distinct interpolation weights (t values) to the self-attention and MLP layers, a deliberate strategy for blending the characteristics of the parent models; an illustrative configuration is sketched below.
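
LazyMergekit-style slerp merges are driven by a mergekit YAML config in which t schedules are filtered by parameter type. The sketch below writes such a config from Python; the layer_range, the specific t schedules, and the choice of base_model are placeholders for demonstration, not the exact values used for this merge.

```python
# Illustrative mergekit config for a slerp merge of the two parent models.
# The layer_range, t schedules, and base_model below are assumptions.
from pathlib import Path

CONFIG = """\
slices:
  - sources:
      - model: Weyaxi/Einstein-v4-7B
        layer_range: [0, 32]
      - model: cognitivecomputations/dolphin-2.8-experiment26-7b
        layer_range: [0, 32]
merge_method: slerp
base_model: Weyaxi/Einstein-v4-7B
parameters:
  t:
    - filter: self_attn          # attention layers follow one schedule...
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp                # ...while MLP layers mirror it
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5                 # everything else is an even blend
dtype: bfloat16
"""

Path("slerp-config.yaml").write_text(CONFIG)
# Then, with mergekit installed:  mergekit-yaml slerp-config.yaml ./merged-model
```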

Good For

  • General-purpose text generation and understanding tasks (a minimal loading sketch follows this list).
  • Applications requiring a model with a broad knowledge base, as indicated by its performance across various MMLU subjects.
  • Experimentation with merged models, particularly those created using the slerp method.
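
For the general-purpose generation use case above, here is a minimal loading sketch using Hugging Face transformers; the prompt and sampling settings are generic assumptions, not documented requirements of this model.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Goekdeniz-Guelmez/J.O.S.I.E.3-Beta4-slerp"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # a 7B model fits in ~14 GB at bf16
    device_map="auto",
)

prompt = "Explain spherical linear interpolation in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```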