DeepHermes 3 - Llama-3.1 8B Preview
DeepHermes 3 Preview, developed by Nous Research, is an 8 billion parameter model based on the Llama-3.1 architecture, featuring a 32768 token context length. It represents the latest iteration in the Hermes series, focusing on user alignment and powerful steering capabilities.
Key Capabilities & Differentiators
- Unified Reasoning and Intuitive Modes: DeepHermes 3 is one of the first models to integrate both standard LLM responses and long chain-of-thought reasoning into a single model. This "deep thinking" mode is activated by a specific system prompt, allowing the model to deliberate extensively before providing a solution.
- Enhanced Agentic Capabilities: Building on its predecessor, Hermes 3, this model offers improvements in agentic functions, roleplaying, multi-turn conversation, and long-context coherence.
- Improved Annotation, Judgment, and Function Calling: The model demonstrates advancements in LLM annotation, judgment, and robust function calling capabilities, supporting structured outputs and tool use.
- Llama-Chat Format: Utilizes the Llama-Chat format for prompt structuring, enabling steerability through system prompts for guiding roles, rules, and stylistic choices.
- Structured Outputs (JSON Mode): Supports generating responses strictly adhering to a provided JSON schema, facilitating integration into applications requiring structured data.
Benchmarks
Benchmarks are provided for both "Reasoning Mode" (evaluated using HuggingFace's open-r1 reasoning mode evaluation suite) and "Non-Reasoning Mode" (evaluated against Llama-3.1-8B-Instruct using LM-Eval-Harness Benchmark Suite). The reasoning mode shows significant gains in tasks like MATH Hard compared to previous Hermes versions.
Use Cases
DeepHermes 3 Preview is particularly well-suited for applications requiring:
- Complex Problem Solving: Leveraging its deep thinking mode for tasks that benefit from extensive internal deliberation and systematic reasoning.
- Agentic Workflows: Where advanced control, roleplaying, and multi-turn conversational coherence are crucial.
- Function Calling and Tool Use: For integrating external tools and APIs, with specific prompt formats for function signature and tool call parsing.
- Structured Data Generation: When outputs need to conform to a precise JSON schema.