Overview
OpenHermes-2-Mistral-7B is a 7 billion parameter language model developed by Teknium, built upon the Mistral architecture. It was fine-tuned using 900,000 entries of high-quality, primarily GPT-4 generated data, sourced from various open datasets. The training data underwent extensive filtering and conversion to the ShareGPT format, then further transformed by axolotl to utilize ChatML.
Key Capabilities
- Enhanced Instruction Following: Leverages the ChatML prompt format, including system prompts, to more strongly engage in instructions that span multiple turns.
- Strong Conversational Abilities: Demonstrated proficiency in general chat, programming discussions, recipe generation, and complex role-playing scenarios.
- Improved Performance: Outperforms previous Nous and OpenHermes models (excluding Hermes 70B) and most current Mistral fine-tunes across various benchmarks.
Benchmark Highlights
OpenHermes-2-Mistral-7B shows notable improvements over its predecessors:
- GPT4All: Achieved an average score of 72.68, a +2.68 change over Nous-Hermes 13B.
- BigBench: Scored 42.3, marking a +5.73 improvement over Nous-Hermes 13B.
- AGIEval: Reached 39.77, a +2.57 increase compared to Nous-Hermes 13B.
- TruthfulQA: Scored 50.92, a +0.54 increase over Nous-Hermes 13B.
Prompt Format
The model utilizes the ChatML format, which is compatible with OpenAI's API. This structured system allows for multi-turn chat dialogue and effective use of system instructions. Quantized versions (GPTQ, GGUF, AWQ) are available for broader deployment.