NousResearch/Hermes-2-Theta-Llama-3-8B

Warm
Public
8B
FP8
8192
May 5, 2024
License: apache-2.0
Hugging Face
Overview

Model Overview

Hermes-2 \u0398 (Theta) is an experimental 8 billion parameter language model developed by Nous Research in collaboration with Arcee. It is a merged and subsequently RLHF'ed version of the Hermes 2 Pro model and Meta's Llama-3 Instruct model, aiming to combine the best features of both. The model utilizes the ChatML prompt format, enabling structured multi-turn chat dialogues and system prompt steerability, similar to OpenAI's API.

Key Capabilities

  • Advanced Function Calling: Hermes-2 \u0398 is specifically trained for function calling, allowing it to generate structured tool calls based on user queries and provided function signatures. A dedicated GitHub repository provides code for building these function calling templates.
  • Structured JSON Output: The model supports a dedicated JSON mode, where it can adhere to a given JSON schema and respond with only a JSON object, making it suitable for applications requiring precise data formatting.
  • ChatML Format: Employs the ChatML format for robust multi-turn conversations, including system prompts for guiding model behavior and style.

Benchmarks

Performance metrics include an average of 72.59 on GPT4All, 44.05 on AGIEval, and 44.13 on BigBench. It also achieved an IFEval score of 72.64 and an MT_Bench average of 8.196875.

Good For

  • Developers building applications that require reliable function calling to interact with external tools and APIs.
  • Use cases demanding structured JSON outputs conforming to specific schemas.
  • Applications benefiting from ChatML-based conversational AI with strong steerability through system prompts.