NousResearch/DeepHermes-3-Llama-3-8B-Preview

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:32kPublished:Feb 12, 2025License:llama3Architecture:Transformer0.4K Warm

NousResearch/DeepHermes-3-Llama-3-8B-Preview is an 8 billion parameter language model developed by Nous Research, built upon the Llama-3.1 architecture with a 32768 token context length. This model uniquely unifies both traditional "intuitive" LLM responses and long chain-of-thought reasoning capabilities, which can be toggled via a system prompt. It is designed for advanced reasoning tasks, improved LLM annotation, judgment, and function calling, making it suitable for applications requiring deliberate problem-solving and structured outputs.

Loading preview...

DeepHermes 3 - Llama-3.1 8B Preview

DeepHermes 3 Preview, developed by Nous Research, is an 8 billion parameter model based on the Llama-3.1 architecture, featuring a 32768 token context length. It represents the latest iteration in the Hermes series, focusing on user alignment and powerful steering capabilities.

Key Capabilities & Differentiators

  • Unified Reasoning and Intuitive Modes: DeepHermes 3 is one of the first models to integrate both standard LLM responses and long chain-of-thought reasoning into a single model. This "deep thinking" mode is activated by a specific system prompt, allowing the model to deliberate extensively before providing a solution.
  • Enhanced Agentic Capabilities: Building on its predecessor, Hermes 3, this model offers improvements in agentic functions, roleplaying, multi-turn conversation, and long-context coherence.
  • Improved Annotation, Judgment, and Function Calling: The model demonstrates advancements in LLM annotation, judgment, and robust function calling capabilities, supporting structured outputs and tool use.
  • Llama-Chat Format: Utilizes the Llama-Chat format for prompt structuring, enabling steerability through system prompts for guiding roles, rules, and stylistic choices.
  • Structured Outputs (JSON Mode): Supports generating responses strictly adhering to a provided JSON schema, facilitating integration into applications requiring structured data.

Benchmarks

Benchmarks are provided for both "Reasoning Mode" (evaluated using HuggingFace's open-r1 reasoning mode evaluation suite) and "Non-Reasoning Mode" (evaluated against Llama-3.1-8B-Instruct using LM-Eval-Harness Benchmark Suite). The reasoning mode shows significant gains in tasks like MATH Hard compared to previous Hermes versions.

Use Cases

DeepHermes 3 Preview is particularly well-suited for applications requiring:

  • Complex Problem Solving: Leveraging its deep thinking mode for tasks that benefit from extensive internal deliberation and systematic reasoning.
  • Agentic Workflows: Where advanced control, roleplaying, and multi-turn conversational coherence are crucial.
  • Function Calling and Tool Use: For integrating external tools and APIs, with specific prompt formats for function signature and tool call parsing.
  • Structured Data Generation: When outputs need to conform to a precise JSON schema.

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model. Click a tab to see each config.

temperature
top_p
top_k
frequency_penalty
presence_penalty
repetition_penalty
min_p