hrktos-37/Hermes-4-70B-heretic

Task: Text Generation | Concurrency Cost: 4 | Model Size: 70B | Quant: FP8 | Ctx Length: 32k | Tool Calling: Supported | Published: Jan 2, 2026 | License: llama3 | Architecture: Transformer

hrktos-37/Hermes-4-70B-heretic is a 70-billion-parameter decensored version of NousResearch's Hermes-4-70B, a Llama-3.1-70B-based reasoning model with a 32,768-token context length. It was modified with Heretic v1.1.0 to reduce refusals and make responses more helpful. The model targets reasoning, math, code, STEM, logic, and creative writing, and offers improved steerability and schema adherence for structured outputs.


hrktos-37/Hermes-4-70B-heretic: Decensored Reasoning Model

This model is a 70-billion-parameter decensored variant of NousResearch's Hermes-4-70B, which is built on Llama-3.1-70B. It was created with Heretic v1.1.0 to reduce refusals: it refuses 26 of 100 prompts on the RefusalBench benchmark, compared with 47 of 100 for the original model. It keeps the 32,768-token context length and is designed for greater helpfulness and user alignment.
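To put the two refusal counts in perspective, the short sketch below computes the absolute and relative drop. Only the counts 47, 26, and 100 come from the text above; the function name `refusal_stats` is a hypothetical helper introduced for illustration.

```python
def refusal_stats(refused_before: int, refused_after: int, total: int) -> dict:
    """Compare refusal counts measured on the same set of prompts."""
    rate_before = refused_before / total
    rate_after = refused_after / total
    return {
        "rate_before": rate_before,
        "rate_after": rate_after,
        # Percentage-point difference, expressed as a fraction.
        "absolute_drop": rate_before - rate_after,
        # Share of the original refusals that no longer occur.
        "relative_drop": (rate_before - rate_after) / rate_before,
    }


# Counts quoted above: 47/100 for the original model, 26/100 for this one.
stats = refusal_stats(refused_before=47, refused_after=26, total=100)
print(f"absolute drop: {stats['absolute_drop']:.2f}")
print(f"relative drop: {stats['relative_drop']:.1%}")
```

Under these numbers the refusal rate falls by 21 percentage points, which is a little under half of the original refusals.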

Key Capabilities

  • Advanced Reasoning: Features a "hybrid reasoning mode" with explicit <think>…</think> segments for deliberation, improving performance in math, code, STEM, and logic.
  • Improved Steerability: Follows user instructions, values, and preferences more closely and refuses less often than the original model.
  • Structured Outputs: Trained for robust schema adherence, capable of producing valid JSON and repairing malformed objects.
  • Enhanced Training: Inherits the Hermes-4-70B post-training corpus of ~5M samples / ~60B tokens, which emphasizes verified reasoning traces.
  • Function Calling & Tool Use: Supports tool calls within a single assistant turn and can combine them with reasoning mode for improved accuracy.
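Because reasoning-mode responses wrap deliberation in <think>…</think> segments, client code usually needs to separate that deliberation from the final answer. The following is a minimal sketch, assuming the tags appear literally in the decoded text and are properly closed; `split_reasoning` and the sample string are hypothetical and not part of any library.

```python
import re
from typing import List, Tuple

# Matches one <think>...</think> segment; DOTALL lets it span multiple lines.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> Tuple[List[str], str]:
    """Return (reasoning segments, remaining answer text)."""
    reasoning = [segment.strip() for segment in THINK_RE.findall(text)]
    answer = THINK_RE.sub("", text).strip()
    return reasoning, answer


# Illustrative response text, not real model output.
sample = "<think>2 + 2 is 4.</think>\nThe answer is 4."
reasoning, answer = split_reasoning(sample)
print(reasoning)
print(answer)
```

Keeping the reasoning segments in a separate list makes it easy to log them for debugging while showing only the answer to end users. A response truncated before the closing tag would not match this pattern and would need separate handling.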

Good for

  • Applications requiring a highly steerable and less censored large language model.
  • Tasks demanding strong reasoning capabilities, including complex problem-solving in math, code, and STEM.
  • Generating structured outputs like JSON, ensuring format-faithful responses.
  • Creative writing and subjective response generation where expressive freedom is desired.
  • Developers looking for a model with explicit internal reasoning processes for debugging or transparency.
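Even with a model trained for schema adherence, it is prudent to validate structured output before using it. The sketch below uses only the Python standard library to check that a response parses as a JSON object and contains the expected keys. The function name `parse_structured_output` and the keys `title` and `score` are assumptions made for illustration, not part of the model card.

```python
import json


def parse_structured_output(raw: str, required_keys: set) -> dict:
    """Parse a model response as a JSON object and check required keys."""
    try:
        obj = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"output is not valid JSON: {exc}") from exc
    if not isinstance(obj, dict):
        raise ValueError("expected a JSON object at the top level")
    missing = required_keys - set(obj)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return obj


# Hypothetical response text and schema keys.
good = parse_structured_output('{"title": "x", "score": 3}', {"title", "score"})
print(good["score"])

try:
    parse_structured_output('{"title": "x"', {"title", "score"})
except ValueError as err:
    # A malformed object could be sent back to the model for repair.
    print("rejected:", err)
```

Raising a single exception type for every failure keeps the calling code simple: any `ValueError` can trigger a retry or a repair prompt.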