Nous-Hermes-Llama2-13b: An Instruction-Tuned Llama2 Model
Nous-Hermes-Llama2-13b is a 13-billion-parameter language model developed by Nous Research, with key contributions from Teknium and Emozilla. It is fine-tuned on over 300,000 instructions, predominantly high-quality synthetic GPT-4 outputs, giving it strong performance in general knowledge, task completion, and stylistic consistency.
Key Capabilities & Differentiators
- Extended Responses: Designed to generate longer, more comprehensive outputs.
- Reduced Hallucination: Exhibits a lower rate of factual fabrication than comparable instruction-tuned models.
- Uncensored Output: Trained without OpenAI-style refusal and content-filtering behaviors in its dataset, offering less restricted generation.
- Consistent Dataset: Utilizes the same dataset as the original Hermes on Llama-1, ensuring a familiar yet more capable experience.
- 4096 Token Context: Supports a substantial context window for processing longer inputs and generating coherent long-form content.
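When working within the 4096-token window, the prompt and the generated continuation share the same budget, so the tokens available for generation shrink as the prompt grows. A minimal sketch of that bookkeeping (the function name and the use of a fixed 4096 constant are illustrative, not part of the model's API):

```python
CONTEXT_WINDOW = 4096  # total tokens shared by prompt + generation

def max_generation_budget(prompt_token_count: int,
                          context_window: int = CONTEXT_WINDOW) -> int:
    """Tokens left for generation after the prompt fills part of the window.

    Returns 0 if the prompt alone already exceeds the window.
    """
    return max(context_window - prompt_token_count, 0)

# A 3000-token prompt leaves 1096 tokens for the model's response.
print(max_generation_budget(3000))  # 1096
```

In practice you would measure `prompt_token_count` with the model's own tokenizer before calling generation, and cap `max_new_tokens` at this budget.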
Performance Highlights
This model demonstrates improved benchmark scores over its predecessor, Hermes-Llama1, across several metrics:
- GPT4All Benchmark Average: Achieves 70.0 (up from 68.8).
- BigBench Average: Scores 0.3657 (up from 0.328).
- AGIEval Average: Reaches 0.372 (up from 0.354).
It currently holds top positions on ARC-c, ARC-e, HellaSwag, and OpenBookQA, and second place on Winogrande, within the GPT4All benchmark suite.
Good For
- Applications requiring detailed and extensive text generation.
- Use cases where lower hallucination rates are critical.
- Scenarios needing an instruction-following model without inherent content restrictions.
- Developers familiar with the Alpaca prompt format, which this model follows.
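Since the model follows the Alpaca prompt format, requests should be wrapped in its `### Instruction:` / `### Response:` template before being sent to the model. A minimal helper that builds such a prompt (the exact template shown is the common Alpaca convention; verify against the model card before relying on it):

```python
def build_alpaca_prompt(instruction: str, user_input: str = "") -> str:
    """Wrap a request in the Alpaca prompt format this model expects.

    The optional `user_input` section carries data the instruction
    should operate on (e.g. text to summarize).
    """
    if user_input:
        return (
            "### Instruction:\n"
            f"{instruction}\n\n"
            "### Input:\n"
            f"{user_input}\n\n"
            "### Response:\n"
        )
    return (
        "### Instruction:\n"
        f"{instruction}\n\n"
        "### Response:\n"
    )

prompt = build_alpaca_prompt("Summarize the plot of Hamlet in two sentences.")
print(prompt)
```

The trailing `### Response:` line is left open so the model's completion begins immediately after it; the generated text up to the next stop sequence is the answer.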