NousResearch/Hermes-2-Pro-Llama-3-70B

Task: Text Generation
Concurrency Cost: 4
Model Size: 70B
Quant: FP8
Ctx Length: 8k
Published: Jun 25, 2024
Architecture: Transformer

NousResearch/Hermes-2-Pro-Llama-3-70B is a 70 billion parameter language model developed by Nous Research, @interstellarninja, and Fireworks.AI. It is an upgraded, retrained version of Nous Hermes 2, featuring an updated dataset and a new Function Calling and JSON Mode dataset. This model excels in general task capabilities, conversation, and specifically in reliable function calling and structured JSON outputs, scoring 90% on function calling and 84% on structured JSON output evaluations. It is optimized for agentic capabilities with special tokens for parsing while streaming.


Hermes 2 Pro - Llama-3 70B Overview

Hermes 2 Pro is a 70 billion parameter model, a collaborative effort by Nous Research, @interstellarninja, and Fireworks.AI. It builds upon Nous Hermes 2 with an enhanced dataset and a newly developed Function Calling and JSON Mode dataset. The model maintains strong general task and conversational abilities while significantly improving its performance in structured outputs.

Key Capabilities

  • Function Calling: Achieves 90% on internal function calling evaluations, utilizing a specialized system prompt and multi-turn structure with new ChatML roles and dedicated tokens (<tools>, <tool_call>, <tool_response>).
  • JSON Structured Outputs: Scores 84% on structured JSON output evaluations, designed to respond with only a JSON object based on a provided schema.
  • General Task & Conversation: Retains excellent performance in broad language understanding and generation tasks.
  • Agentic Capabilities: Encodes each tool-call tag as a single token, so a client can detect tool calls reliably while parsing streamed output.
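
To make the function calling flow above more concrete, here is a minimal Python sketch of how a client might pull tool calls out of a model reply and wrap a tool result for the next turn. It assumes that each `<tool_call>` block contains a JSON object with `name` and `arguments` keys and that a `<tool_response>` block carries a JSON object with `name` and `content` keys. Those payload shapes, and the helper names `extract_tool_calls` and `format_tool_response`, are illustrative assumptions, not something stated on this page.

```python
import json
import re

# Matches the text between <tool_call> and </tool_call>, across newlines.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def extract_tool_calls(text):
    """Return the JSON payloads found inside <tool_call> blocks.

    Assumption: each payload is a JSON object with a "name" key
    (and usually an "arguments" key). Malformed blocks are skipped.
    """
    calls = []
    for match in TOOL_CALL_RE.finditer(text):
        try:
            payload = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue  # not valid JSON: ignore this block
        if isinstance(payload, dict) and "name" in payload:
            calls.append(payload)
    return calls


def format_tool_response(name, content):
    """Wrap a tool result in <tool_response> tags (assumed payload shape)."""
    body = json.dumps({"name": name, "content": content})
    return "<tool_response>\n" + body + "\n</tool_response>"


if __name__ == "__main__":
    reply = (
        "<tool_call>\n"
        '{"name": "get_weather", "arguments": {"city": "Paris"}}\n'
        "</tool_call>"
    )
    for call in extract_tool_calls(reply):
        print(call["name"], call.get("arguments"))
```

Skipping malformed blocks rather than raising keeps the client usable when a reply contains a broken call; a stricter application may prefer to surface the error and re-prompt.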

Good For

  • Developers requiring robust function calling in their applications.
  • Use cases demanding reliable JSON structured outputs from an LLM.
  • General conversational AI and complex task execution where precise output formatting is critical.
  • Applications leveraging agentic workflows that benefit from specialized parsing tokens.
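
Because the reported structured-output score is 84% rather than 100%, applications that rely on JSON replies may still want a validation step. The sketch below is one possible guard using only the Python standard library: it accepts a reply only if it is a single JSON object containing a set of required keys. The function name `parse_json_object` and the `required_keys` parameter are hypothetical, and a full JSON Schema validator would be a stricter alternative.

```python
import json


def parse_json_object(reply, required_keys=()):
    """Return the parsed object, or None if the reply is not usable.

    A reply is rejected when it is not valid JSON, is not a JSON
    object, or lacks any of the required top-level keys.
    """
    try:
        obj = json.loads(reply.strip())
    except json.JSONDecodeError:
        return None
    if not isinstance(obj, dict):
        return None
    if any(key not in obj for key in required_keys):
        return None
    return obj


if __name__ == "__main__":
    good = '{"name": "Ada", "age": 36}'
    bad = 'Here is the JSON: {"name": "Ada"}'
    print(parse_json_object(good, ("name", "age")))
    print(parse_json_object(bad, ("name", "age")))
```

Returning `None` lets the caller decide whether to retry the request, fall back to a default, or report the failure.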

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model. Each configuration sets the following parameters:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p