N0rv3ll/Hermes-4-70B

TEXT GENERATION | Concurrency Cost: 4 | Model Size: 70B | Quant: FP8 | Ctx Length: 32k | Published: Apr 19, 2026 | License: llama3 | Architecture: Transformer | Cold

Hermes 4 70B by Nous Research is a 70-billion-parameter, Llama-3.1-based hybrid-mode reasoning model with a 32,768-token context length. It is post-trained on a large corpus emphasizing verified reasoning traces, which improves its performance in math, code, STEM, logic, and creative writing. The model produces format-faithful outputs and is designed for steerability and alignment to user values.

Hermes 4 70B: A Hybrid Reasoning Model

Hermes 4 70B, developed by Nous Research, is a 70-billion-parameter model built on the Llama-3.1 architecture. Its "hybrid reasoning mode" lets the model deliberate internally in <think>…</think> segments before it generates a response, which is intended to improve its problem solving.
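Because the deliberation arrives inline with the response, client code usually wants to separate the two. The following is a minimal sketch, assuming the raw output wraps its reasoning in literal <think>…</think> tags as described above; the helper name `split_reasoning` is hypothetical and is not part of any Hermes or Nous Research API.

```python
import re

# Matches one <think>...</think> segment; DOTALL lets the reasoning span lines.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(raw: str) -> tuple[str, str]:
    """Split raw model output into (reasoning, answer).

    Hypothetical helper: assumes reasoning is delimited by <think> tags.
    Output without any tags yields an empty reasoning string.
    """
    segments = [segment.strip() for segment in THINK_RE.findall(raw)]
    answer = THINK_RE.sub("", raw).strip()
    return "\n".join(segments), answer


if __name__ == "__main__":
    raw = "<think>\n2 + 2 is 4.\n</think>\nThe answer is 4."
    reasoning, answer = split_reasoning(raw)
    print("reasoning:", reasoning)
    print("answer:", answer)
```

Keeping the reasoning separate lets an application log or hide it without touching the text shown to the user.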

Key Capabilities & Improvements

  • Enhanced Reasoning: Significant improvements in math, code, STEM, logic, and creative writing, driven by a post-training corpus of ~5M samples / ~60B tokens focused on verified reasoning traces.
  • Schema Adherence & Structured Outputs: Trained to produce valid JSON for given schemas and to repair malformed objects, making it suitable for structured data generation.
  • Steerability & Alignment: Markedly improved steerability and lower refusal rates, with reported state-of-the-art performance on RefusalBench for helpfulness and alignment without censorship.
  • Function Calling & Tool Use: Supports function/tool calls within a single assistant turn, integrating them after its internal reasoning process.
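Even with a model trained for schema adherence, downstream code typically still validates what it receives. The sketch below uses only the standard library and checks required fields and their Python types; it is a simplified stand-in for full JSON Schema validation, and the function name `parse_structured_output` and the example fields are hypothetical.

```python
import json


def parse_structured_output(text: str, required: dict[str, type]) -> dict:
    """Parse model output as a JSON object and check required fields.

    Hypothetical helper: `required` maps field names to expected Python
    types. Raises ValueError on malformed JSON or a failed check.
    """
    try:
        obj = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(obj, dict):
        raise ValueError("expected a JSON object at the top level")
    for key, expected in required.items():
        if key not in obj:
            raise ValueError(f"missing field: {key}")
        if not isinstance(obj[key], expected):
            raise ValueError(f"field {key!r} is not of type {expected.__name__}")
    return obj


if __name__ == "__main__":
    schema = {"name": str, "age": int}  # illustrative fields only
    print(parse_structured_output('{"name": "Ada", "age": 36}', schema))
```

When a check fails, one option is to send the error message back to the model and ask it to repair the object, which matches the repair capability described above.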

When to Use This Model

  • Complex Reasoning Tasks: Ideal for applications requiring deep thought, logical problem-solving, and accurate outputs in technical domains.
  • Structured Data Generation: Excellent for tasks needing precise JSON or other schema-compliant outputs.
  • Customizable Alignment: Suitable for use cases where specific alignment to user values and reduced refusal rates are critical.
  • Creative & Subjective Content: Capable of generating high-quality creative writing and subjective responses while maintaining general assistant quality.