Weyaxi/OpenHermes-2.5-neural-chat-7b-v3-1-7B

TEXT GENERATIONConcurrency Cost:1Model Size:7BQuant:FP8Ctx Length:4kPublished:Nov 24, 2023License:apache-2.0Architecture:Transformer0.0K Open Weights Cold

Weyaxi/OpenHermes-2.5-neural-chat-7b-v3-1-7B is a 7 billion parameter language model created by Weyaxi, formed by merging teknium/OpenHermes-2.5-Mistral-7B and Intel/neural-chat-7b-v3-1 using a ties merge. This model leverages the strengths of its base models, offering a 4096-token context length and achieving an average score of 67.84 on the Open LLM Leaderboard, making it suitable for general conversational AI and instruction-following tasks.

Loading preview...

Overview

Weyaxi/OpenHermes-2.5-neural-chat-7b-v3-1-7B is a 7 billion parameter language model developed by Weyaxi. It is a merge of two prominent models: teknium/OpenHermes-2.5-Mistral-7B and Intel/neural-chat-7b-v3-1, utilizing a ties merge approach. This combination aims to integrate the distinct capabilities of its constituent models.

Key Capabilities

  • Instruction Following: Inherits strong instruction-following abilities from its base models.
  • General Conversational AI: Designed for broad conversational applications.
  • Performance: Achieves a competitive average score of 67.84 on the Open LLM Leaderboard, with specific scores including:
    • ARC (25-shot): 66.55
    • HellaSwag (10-shot): 84.47
    • MMLU (5-shot): 63.34
    • TruthfulQA (0-shot): 61.22
    • Winogrande (5-shot): 78.37
    • GSM8K (5-shot): 53.07

Good For

  • Chatbot Development: Ideal for creating responsive and coherent conversational agents.
  • Instruction-based Tasks: Suitable for applications requiring the model to follow specific commands or prompts.
  • Research and Experimentation: Provides a robust base for further fine-tuning or architectural exploration, leveraging the combined strengths of its merged components.

Prompt Templates

The model supports multiple prompt templates, with ChatML (from OpenHermes-2.5-Mistral-7B) being recommended, alongside the template from neural-chat-7b-v3-1.

Quantized Versions

Optimized, quantized versions are available from TheBloke in GPTQ, GGUF, and AWQ formats for efficient deployment.