teknium/OpenHermes-7B

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4K · Published: Sep 14, 2023 · License: MIT · Architecture: Transformer · Open Weights

OpenHermes-7B by Teknium is a 7 billion parameter instruction-tuned language model, fine-tuned on 242,000 entries of primarily GPT-4 generated data. It was trained with sample packing to improve training efficiency, and is suited to general instruction-following tasks. The model's training dataset is fully open-source, distinguishing it from other models in its class.


OpenHermes-7B: An Open-Source Instruction-Tuned Model

OpenHermes-7B is a 7 billion parameter language model developed by Teknium, notable for being the first fine-tune of the Hermes dataset with a fully open-source training dataset. This model was trained on 242,000 entries of high-quality, primarily GPT-4 generated data, sourced from various open datasets including GPTeacher, WizardLM, Airoboros GPT-4, Camel-AI, CodeAlpaca, GPT4-LLM, and Unnatural Instructions.

Key Training Innovations

A significant feature of OpenHermes-7B's training is the use of sample packing. This technique substantially speeds up training when the average sample length is well below the maximum sequence length, because multiple short samples are packed into each training sequence instead of being padded individually, so less compute is wasted on padding tokens.
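The idea can be sketched in a few lines of Python. This is an illustrative greedy packer, not the actual OpenHermes training code: the function name and EOS handling are assumptions, and real implementations also adjust attention masks so packed samples do not attend to each other.

```python
def pack_samples(samples, max_seq_len, eos_id=2):
    """Greedily concatenate tokenized samples (lists of token ids) into
    training sequences of at most max_seq_len tokens, each sample
    terminated by an EOS token. Illustrative sketch only."""
    packed, current = [], []
    for sample in samples:
        sample = sample[:max_seq_len - 1]  # leave room for the EOS token
        # Start a new sequence if this sample would overflow the current one.
        if len(current) + len(sample) + 1 > max_seq_len:
            packed.append(current)
            current = []
        current.extend(sample)
        current.append(eos_id)
    if current:
        packed.append(current)
    return packed


# Three 10-token samples packed into 32-token sequences: the first two
# fit together (22 tokens with EOS separators), the third starts a new
# sequence, instead of padding three sequences of 32 tokens each.
sequences = pack_samples([[1] * 10, [1] * 10, [1] * 10], max_seq_len=32)
```

Without packing, these three samples would occupy three padded 32-token sequences (96 tokens, most of them padding); packed, they need only two sequences, which is where the training speedup comes from.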

Performance Benchmarks

OpenHermes-7B has been evaluated across several benchmarks, demonstrating its general instruction-following capabilities:

  • GPT4All Benchmark Set: Achieved an average score of 0.679.
  • BigBench: Scored an average of 0.3367 across various tasks.
  • AGI Eval: Recorded an average of 0.2683.
  • TruthfulQA: Achieved an MC2 score of 0.4542.

Good for

  • General instruction-following: Excels in tasks requiring understanding and execution of diverse instructions.
  • Research and development: Its fully open-source dataset makes it ideal for researchers and developers looking to understand and build upon its training data.
  • Efficient fine-tuning: The sample packing methodology suggests potential for efficient further fine-tuning on custom datasets.