Name: CMU-AIRe/TARS-1.5B API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: CMU-AIRe

TARS-1.5B: An Adaptive Reasoning Model for LLM Safety

CMU-AIRe/TARS-1.5B is a 1.5-billion-parameter open-source reasoning model built on Qwen2.5-1.5B-Instruct, with a 32,768-token context length. Developed by CMU-AIRe, it is intended for research in LLM safety and is trained with TARS (Training Adaptive Reasoners for Safety), an online reinforcement learning (RL) method that trains models to reason adaptively, with the aim of keeping refusal rates low while making behavior safer.

Key Capabilities & Training

The TARS training method, which involves a 50/50 mix of harmful and harmless prompts, incorporates three core ingredients:

Lightweight Supervised Fine-Tuning (SFT): Enables the model to generate diverse responses.
Harmless Prompt Mixing: Integrates harmless prompts during the RL training phase to balance safety and utility.
Decoupled Reward Model: Utilizes a separate reward model to facilitate better exploration during the learning process.
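
As a rough illustration of the 50/50 prompt mix described above, the following Python sketch samples a training batch with an equal share of harmful and harmless prompts. It is not the TARS training code: the function name mix_prompts, its parameters, and the string labels are hypothetical and chosen only for this example.

```python
import random
from typing import List, Tuple


def mix_prompts(
    harmful: List[str],
    harmless: List[str],
    batch_size: int,
    harmful_fraction: float = 0.5,  # 0.5 gives the 50/50 mix; assumed knob, not from the model card
    seed: int = 0,
) -> List[Tuple[str, str]]:
    """Return a shuffled batch of (prompt, label) pairs.

    Hypothetical helper: samples with replacement from each pool so that
    about `harmful_fraction` of the batch comes from the harmful pool.
    """
    rng = random.Random(seed)
    n_harmful = round(batch_size * harmful_fraction)
    n_harmless = batch_size - n_harmful

    # Sample each pool separately, then interleave by shuffling.
    batch = [(p, "harmful") for p in rng.choices(harmful, k=n_harmful)]
    batch += [(p, "harmless") for p in rng.choices(harmless, k=n_harmless)]
    rng.shuffle(batch)
    return batch


if __name__ == "__main__":
    demo = mix_prompts(
        harmful=["<harmful prompt 1>", "<harmful prompt 2>"],
        harmless=["<harmless prompt 1>", "<harmless prompt 2>", "<harmless prompt 3>"],
        batch_size=8,
    )
    for prompt, label in demo:
        print(label, prompt)
```

Keeping the label next to each prompt is a design choice for this sketch: it would let a later RL step treat harmful and harmless prompts differently. How TARS actually does that is described in the associated paper, not here.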

Use Cases

This model is primarily intended for:

Research in LLM Safety: Provides a specialized tool for exploring and developing safer AI systems.
Adaptive Reasoning Studies: Ideal for investigating how models can adaptively reason to avoid harmful outputs while maintaining helpfulness.

For full details on the TARS methodology, refer to the associated paper and blog post.
