Weyaxi/OpenHermes-2.5-neural-chat-v3-3-Slerp

TEXT GENERATION | Concurrency Cost: 1 | Model Size: 7B | Quant: FP8 | Ctx Length: 4k | Published: Dec 9, 2023 | License: apache-2.0 | Architecture: Transformer | Open Weights

Weyaxi/OpenHermes-2.5-neural-chat-v3-3-Slerp is a 7 billion parameter language model created by Weyaxi by merging teknium/OpenHermes-2.5-Mistral-7B and Intel/neural-chat-7b-v3-3 with the slerp method. The model supports a 4096-token context window and is designed for general conversational AI tasks. It performs strongly across standard benchmarks, with an average score of 71.38 on the Open LLM Leaderboard, making it suitable for diverse applications requiring robust language understanding and generation.


Overview

Weyaxi/OpenHermes-2.5-neural-chat-v3-3-Slerp is a 7 billion parameter language model developed by Weyaxi. It was created by merging two distinct models, teknium/OpenHermes-2.5-Mistral-7B and Intel/neural-chat-7b-v3-3, using the slerp (spherical linear interpolation) method via mergekit. The base model for this merge was mistralai/Mistral-7B-v0.1.
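A slerp merge like this is typically expressed as a mergekit YAML config. The sketch below shows the general shape of such a config under the models and base model named above; the layer ranges, interpolation factor, and dtype are illustrative assumptions, not Weyaxi's published recipe.

```yaml
# Hypothetical mergekit slerp config -- parameter values are illustrative,
# not the actual recipe used for this model.
slices:
  - sources:
      - model: teknium/OpenHermes-2.5-Mistral-7B
        layer_range: [0, 32]
      - model: Intel/neural-chat-7b-v3-3
        layer_range: [0, 32]
merge_method: slerp
base_model: mistralai/Mistral-7B-v0.1
parameters:
  t: 0.5        # interpolation factor between the two models (assumed)
dtype: bfloat16  # assumed precision for the merged weights
```

Spherical linear interpolation blends the two weight vectors along the arc between them rather than along a straight line, which tends to preserve the magnitude of the merged weights better than plain averaging.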

Key Capabilities & Performance

This model is designed for general conversational AI and demonstrates competitive performance across standard benchmarks. On the Open LLM Leaderboard, it achieved an average score of 71.38. Specific benchmark results include:

  • ARC (25-shot): 68.09
  • HellaSwag (10-shot): 86.2
  • MMLU (5-shot): 64.26
  • TruthfulQA (0-shot): 62.78
  • Winogrande (5-shot): 79.16
  • GSM8K (5-shot): 67.78
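The leaderboard average is simply the arithmetic mean of the six benchmark scores above, which can be checked directly:

```python
# Verify the Open LLM Leaderboard average from the six per-benchmark scores.
scores = {
    "ARC (25-shot)": 68.09,
    "HellaSwag (10-shot)": 86.2,
    "MMLU (5-shot)": 64.26,
    "TruthfulQA (0-shot)": 62.78,
    "Winogrande (5-shot)": 79.16,
    "GSM8K (5-shot)": 67.78,
}
average = round(sum(scores.values()) / len(scores), 2)
print(average)  # 71.38
```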

Prompting and Usage

The model supports several prompt templates, with ChatML recommended for best results, consistent with OpenHermes-2.5-Mistral-7B. It also accepts the prompt template used by neural-chat-7b-v3-3.
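ChatML wraps each conversation turn in `<|im_start|>role` / `<|im_end|>` markers and leaves the prompt open for the assistant's reply. A minimal sketch of building such a prompt by hand (the system message text is a placeholder, and `build_chatml_prompt` is a hypothetical helper, not part of any library):

```python
def build_chatml_prompt(messages: list[dict]) -> str:
    """Format a list of {'role', 'content'} dicts as a ChatML prompt string."""
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>" for m in messages]
    # End with an open assistant turn so generation continues from here.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = build_chatml_prompt([
    {"role": "system", "content": "You are a helpful assistant."},  # placeholder
    {"role": "user", "content": "Hello!"},
])
print(prompt)
```

In practice, if the model's tokenizer ships a chat template, `tokenizer.apply_chat_template(messages, add_generation_prompt=True)` from the `transformers` library produces the equivalent string without hand-rolling the markers.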

Quantized Versions

For users who need a smaller memory footprint or faster inference, quantized versions of this model are available from TheBloke in GPTQ, GGUF, and AWQ formats.