LoganResearch/ARC-Base-8B

Text generation | Concurrency cost: 1 | Model size: 8B | Quantization: FP8 | Context length: 32k | Published: Jan 17, 2026 | License: CC-BY-4.0 | Architecture: Transformer | Open weights

LoganResearch/ARC-Base-8B is an 8 billion parameter Llama 3.1-based language model developed by Logan Research, featuring a 32,768 token context length. It integrates an Adaptive Repetition Controller (ARC) system that uses Contrastive Fiber Heads-on-Thought (CF-HoT) to detect and suppress, at decode time, undesirable behavioral patterns such as repetition, verbosity, and hedging. The model is optimized for concise, information-dense responses with minimal latency overhead, making it suitable for applications that require direct and efficient communication.


ARC-Base-8B: Adaptive Repetition Controller

LoganResearch/ARC-Base-8B is an 8 billion parameter language model built on the Hermes-3-Llama-3.1-8B architecture, designed to address common behavioral patterns observed in RLHF-aligned models such as verbosity, hedging, and repetition. Developed by Logan Matthew Napolitano of Logan Research, this model introduces a novel decode-time intervention system called Adaptive Repetition Controller (ARC).

Key Capabilities

  • Decode-Time Behavioral Intervention: ARC employs lightweight prediction heads (~5,300 parameters each) called Contrastive Fiber Heads-on-Thought (CF-HoT) to detect and suppress undesirable output patterns in real time during decoding.
  • Repetition Suppression: Achieves a 91% reduction in repetition instances, with 125x class separation for repetition detection, yielding more concise outputs.
  • Increased Information Density: Improves information density by an estimated 38% by reducing filler phrases and unnecessary elaboration.
  • Minimal Latency Overhead: The entire ARC system adds less than 1% latency overhead (approximately 0.22ms) during inference, making it practical for production environments.
  • Targeted Interventions: Detects and intervenes on patterns like hedging (1.5x separation) and verbosity (2.1x separation), in addition to repetition.
  • Open-Source Release: Provides the complete model, detection heads, and inference code for full transparency and research.
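The decode-time intervention described above can be sketched as follows. Everything in this snippet is a hedged illustration, not the released implementation: `PatternHead` assumes a CF-HoT head is roughly a linear probe over the decoder hidden state (4096 weights plus a bias is close to the ~5,300 parameters quoted, though the real heads' structure is not documented here), and `suppress` assumes the intervention takes the form of a logit penalty on recently generated tokens whenever the detector fires.

```python
import numpy as np

rng = np.random.default_rng(0)

HIDDEN_DIM = 4096    # Llama 3.1 8B hidden size
VOCAB_SIZE = 128256  # Llama 3.1 vocabulary size


class PatternHead:
    """Hypothetical stand-in for a CF-HoT detection head.

    A single linear probe over the final hidden state; the weight count
    (4096 + 1 bias) is near the ~5,300-parameter figure quoted above,
    but this structure is an assumption, not the published design.
    """

    def __init__(self, dim: int) -> None:
        self.w = rng.normal(scale=dim ** -0.5, size=dim)
        self.b = 0.0

    def score(self, hidden: np.ndarray) -> float:
        # Sigmoid probability that the current decode step shows the pattern.
        return float(1.0 / (1.0 + np.exp(-(hidden @ self.w + self.b))))


def suppress(logits: np.ndarray, recent_ids, pattern_prob: float,
             threshold: float = 0.5, penalty: float = 2.0) -> np.ndarray:
    """If the detector fires, down-weight recently generated token ids."""
    out = logits.copy()
    if pattern_prob > threshold:
        out[sorted(set(recent_ids))] -= penalty * pattern_prob
    return out


head = PatternHead(HIDDEN_DIM)
hidden = rng.normal(size=HIDDEN_DIM)   # stand-in for a decoder hidden state
logits = rng.normal(size=VOCAB_SIZE)   # stand-in for next-token logits
adjusted = suppress(logits, recent_ids=[42, 17, 42],
                    pattern_prob=head.score(hidden))
```

A linear probe of this size costs one 4,096-element dot product per head per token, which is negligible next to an 8B-parameter forward pass; that is consistent with the sub-1% (~0.22 ms) overhead figure above, though how the released heads are actually wired in may differ.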

Good For

  • Applications requiring concise and direct language generation.
  • Reducing common LLM artifacts like repetitive phrases, excessive hedging, and verbose responses.
  • Scenarios where information density and efficiency are prioritized over conversational fluff.
  • Researchers interested in decode-time behavioral steering and mechanistic interpretability of LLMs.