Goekdeniz-Guelmez/J.O.S.I.E.3-Beta3-slerp

Text generation · Model size: 7B · Quantization: FP8 · Context length: 4k · Published: Mar 15, 2024 · License: apache-2.0 · Architecture: Transformer

J.O.S.I.E.3-Beta3-slerp is a 7 billion parameter language model developed by Goekdeniz-Guelmez, created by merging Locutusque/Hercules-3.1-Mistral-7B and cognitivecomputations/dolphin-2.8-experiment26-7b using a slerp merge method. This model demonstrates strong performance across various benchmarks, including an overall accuracy of 64.32% and a Hellaswag acc_norm of 84.56%. While it performs well on evaluations, the developer notes that its conversational quality is not yet optimal and plans further training on datasets like Dolphin.


J.O.S.I.E.3-Beta3-slerp: A Merged 7B Language Model

J.O.S.I.E.3-Beta3-slerp is a 7 billion parameter language model developed by Goekdeniz-Guelmez. It was created through a slerp merge of two base models: Locutusque/Hercules-3.1-Mistral-7B and cognitivecomputations/dolphin-2.8-experiment26-7b. This merging strategy aims to combine the strengths of its constituent models.
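The merge recipe itself is not detailed here, but slerp-based merges generally interpolate corresponding weight tensors along the arc between them rather than along a straight line, which tends to preserve the scale of the blended weights. The snippet below is a minimal, hypothetical sketch in PyTorch of that idea; the `slerp` and `merge_state_dicts` helpers are illustrative only and are not taken from this model's repository (merges like this one are usually produced with dedicated tooling such as mergekit).

```python
import torch

def slerp(t: float, a: torch.Tensor, b: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors of the same shape."""
    a_flat, b_flat = a.flatten().float(), b.flatten().float()
    a_unit = a_flat / (a_flat.norm() + eps)
    b_unit = b_flat / (b_flat.norm() + eps)
    # Angle between the two weight vectors.
    omega = torch.acos(torch.clamp(torch.dot(a_unit, b_unit), -1.0, 1.0))
    so = torch.sin(omega)
    if so.abs() < eps:
        # Nearly parallel vectors: fall back to plain linear interpolation.
        merged = (1.0 - t) * a_flat + t * b_flat
    else:
        merged = (torch.sin((1.0 - t) * omega) / so) * a_flat + (torch.sin(t * omega) / so) * b_flat
    return merged.reshape(a.shape).to(a.dtype)

def merge_state_dicts(sd_a: dict, sd_b: dict, t: float = 0.5) -> dict:
    """Blend every matching parameter from two state dicts at interpolation factor t."""
    return {name: slerp(t, sd_a[name], sd_b[name]) for name in sd_a}
```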

Key Capabilities and Performance

The model posts solid scores across a range of standard benchmarks. Notable results include:

  • Overall Accuracy (all acc): 64.32%
  • Hellaswag (acc_norm): 84.56%
  • Winogrande (acc): 80.42%
  • GSM8K (acc): 58.60%

The model also shows strong results in various MMLU categories, such as high school government and politics (89.63%) and marketing (87.60%).
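For anyone who wants to try the model or sanity-check these results locally, it can be loaded like any other Mistral-style causal language model. The sketch below assumes the weights are published on the Hugging Face Hub under the repo id shown in the title and that `torch` (and `accelerate`, for `device_map="auto"`) are installed; it is a minimal usage example, not an official quickstart.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Goekdeniz-Guelmez/J.O.S.I.E.3-Beta3-slerp"  # assumed Hub repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

prompt = "Explain what a slerp model merge is in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```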

Current Status and Future Development

While J.O.S.I.E.3-Beta3-slerp performs well on evaluation benchmarks, the developer notes that its conversational quality is not yet ideal and it is not uncensored. Future training efforts are planned, specifically utilizing datasets like Dolphin, to enhance its interactive capabilities and overall chat performance.