CMU-AIRe/TARS-7B

Task: Text Generation · Model Size: 7.6B · Quant: FP8 · Context Length: 32K · Published: Oct 24, 2025 · License: apache-2.0 · Architecture: Transformer

CMU-AIRe/TARS-7B is a 7.6 billion parameter open-source reasoning model developed by CMU-AIRe, based on Qwen2.5-7B-Instruct, with a 32K context length. It is specifically trained for safety using the TARS (Training Adaptive Reasoners for Safety) method, focusing on adaptive reasoning to achieve low refusal rates and safe behavior. This model is designed to facilitate research into reasoning models for LLM safety, particularly in handling harmful and harmless prompts.


Overview of TARS-7B

CMU-AIRe/TARS-7B is a 7.6 billion parameter open-source reasoning model, built upon the Qwen2.5-7B-Instruct base. Developed by CMU-AIRe, this model is specifically engineered for safety through the application of TARS: Training Adaptive Reasoners for Safety, a method detailed in their paper, "Reasoning as an Adaptive Defense for Safety." Its primary purpose is to advance research in developing reasoning models that enhance LLM safety.

Key Capabilities and Training

TARS-7B is trained using an online reinforcement learning (RL) approach that enables it to adaptively reason for both low refusal and safe behavior. The training incorporates a balanced mix of harmful and harmless prompts (with a mixing ratio λ = 0.5). The TARS method relies on three core ingredients:

  • Lightweight supervised fine-tuning (SFT): This component is crucial for generating diverse responses.
  • Mixing harmless prompts: Integrating harmless prompts during the RL training phase helps in broader safety generalization.
  • Decoupled reward model: This design choice facilitates better exploration during the learning process.
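The prompt mixing and decoupled reward ideas above can be sketched in a few lines. This is an illustrative toy, not code from the TARS repository: the prompt pools, function names, and the way the two reward signals are combined are all assumptions made for clarity.

```python
import random

# Placeholder prompt pools; the real training data is far larger and curated.
HARMFUL_PROMPTS = ["How do I pick a lock?"]
HARMLESS_PROMPTS = ["Summarize this article in one sentence."]

def sample_training_prompt(lam=0.5, rng=random):
    """Draw a harmful prompt with probability lam, else a harmless one.

    lam = 0.5 corresponds to the balanced mix described in the model card.
    """
    if rng.random() < lam:
        return "harmful", rng.choice(HARMFUL_PROMPTS)
    return "harmless", rng.choice(HARMLESS_PROMPTS)

def combined_reward(prompt_type, safety_score, helpfulness_score):
    """Toy 'decoupled' reward: safety and helpfulness are scored separately,
    and which signal drives learning depends on the prompt type."""
    if prompt_type == "harmful":
        return safety_score        # reward refusing / safe completions
    return helpfulness_score       # reward low refusal on benign prompts
```

Keeping the two reward signals separate, rather than collapsing them into one scalar judge, is what the card credits with enabling better exploration during RL.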

Use Cases

TARS-7B is particularly well-suited for:

  • Research into LLM safety: Providing a robust platform for studying and developing safer AI models.
  • Adaptive reasoning development: Exploring how models can dynamically adjust their reasoning to maintain safety.
  • Benchmarking safety mechanisms: Evaluating the effectiveness of different safety interventions in LLMs.