OpenHermes-7B: An Open-Source Instruction-Tuned Model
OpenHermes-7B is a 7 billion parameter language model developed by Teknium, notable for being the first Hermes fine-tune released with a fully open training dataset. It was trained on roughly 242,000 entries of high-quality, primarily GPT-4-generated data drawn from open datasets including GPTeacher, WizardLM, Airoboros GPT-4, Camel-AI, CodeAlpaca, GPT4-LLM, and Unnatural Instructions.
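As a rough sketch of how a prompt might be formatted for the model: several of the source datasets listed above (GPTeacher, GPT4-LLM) use the Alpaca instruction template, so a builder like the one below is a reasonable starting point. That OpenHermes-7B expects exactly this template is an assumption; consult the model card for the canonical format.

```python
def build_prompt(instruction: str, user_input: str = "") -> str:
    """Format a request in the Alpaca instruction style used by several
    of the source datasets. NOTE: this template is an assumption, not a
    confirmed spec for OpenHermes-7B -- verify against the model card."""
    if user_input:
        return (
            "### Instruction:\n" + instruction
            + "\n\n### Input:\n" + user_input
            + "\n\n### Response:\n"
        )
    return "### Instruction:\n" + instruction + "\n\n### Response:\n"

# Example: a bare instruction with no auxiliary input.
print(build_prompt("List three uses of instruction-tuned models."))
```

The trailing "### Response:" marker cues the model to begin its answer; generation is then stopped when the model emits the next "###" delimiter.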
Key Training Innovations
A significant feature of OpenHermes-7B's training is the use of sample packing. Rather than padding each short example to the maximum sequence length, sample packing concatenates multiple examples into a single training sequence, so almost every token in a batch is real data. This substantially speeds up training whenever the dataset's average token length is well below the maximum sequence length, since the compute otherwise spent on padding is eliminated.
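The idea can be sketched with a greedy packer over tokenized samples. This is a minimal illustration of the technique, not the project's actual implementation; the function name and greedy strategy are my own.

```python
from typing import List

def pack_samples(samples: List[List[int]], max_seq_len: int) -> List[List[int]]:
    """Greedily concatenate tokenized samples into sequences of at most
    max_seq_len tokens, so short samples share one training sequence
    instead of each being padded to full length.

    Hypothetical sketch of sample packing, not OpenHermes's actual code.
    """
    packed: List[List[int]] = []
    current: List[int] = []
    for tokens in samples:
        if len(tokens) > max_seq_len:
            tokens = tokens[:max_seq_len]  # truncate oversized samples
        if len(current) + len(tokens) > max_seq_len:
            packed.append(current)  # flush the full sequence
            current = []
        current.extend(tokens)
    if current:
        packed.append(current)
    return packed

# Three short samples (9 tokens total) fit in one 16-token sequence,
# instead of three mostly-padding sequences.
print(pack_samples([[1, 2, 3], [4, 5, 6, 7], [8, 9]], max_seq_len=16))
# → [[1, 2, 3, 4, 5, 6, 7, 8, 9]]
```

Real implementations additionally adjust the attention mask (or position IDs) so tokens from one packed sample cannot attend to tokens from another; that bookkeeping is omitted here for brevity.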
Performance Benchmarks
OpenHermes-7B has been evaluated across several benchmarks, demonstrating its general instruction-following capabilities:
- GPT4All Benchmark Set: Achieved an average score of 0.679.
- BigBench: Scored an average of 0.3367 across various tasks.
- AGI Eval: Recorded an average of 0.2683.
- TruthfulQA: Achieved an MC2 score of 0.4542.
Good for
- General instruction-following: Excels in tasks requiring understanding and execution of diverse instructions.
- Research and development: Its fully open-source dataset makes it ideal for researchers and developers looking to understand and build upon its training data.
- Efficient fine-tuning: The sample-packing setup suggests the model can be efficiently fine-tuned further on custom datasets.