declare-lab/starling-7B

TEXT GENERATION · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Aug 18, 2023 · License: apache-2.0 · Architecture: Transformer · Open Weights

Starling-7B is a 7 billion parameter language model developed by declare-lab, fine-tuned from Vicuna-7B. It is specifically designed for safety alignment, utilizing the ChatGPT-distilled HarmfulQA dataset collected via the Chain of Utterances (CoU) prompt. This model demonstrates improved safety performance, showing a reduction in Attack Success Rate on safety benchmarks and an improvement in HHH scores compared to its Vicuna baseline. Starling-7B is optimized for generating safer responses, making it suitable for applications requiring robust content moderation and reduced harmful outputs.


Starling-7B: A Safety-Aligned Language Model

Starling-7B, developed by declare-lab, is a 7 billion parameter model fine-tuned from Vicuna-7B with a 4096-token context length. Its primary innovation lies in its safety alignment, achieved by training on the HarmfulQA dataset. This dataset, distilled from ChatGPT using the Chain of Utterances (CoU) prompt, focuses on identifying and mitigating harmful content.
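The Chain of Utterances (CoU) prompt mentioned above elicits harmful questions and responses through a multi-turn conversation format. The exact template used by declare-lab is not reproduced here; the sketch below only illustrates the general shape of assembling such a multi-turn prompt, and the helper name and turns are hypothetical.

```python
# Illustrative sketch of a Chain of Utterances (CoU) style multi-turn prompt.
# The actual CoU template from the HarmfulQA work is an assumption here;
# this only shows the general turn-joining structure.

def build_cou_prompt(utterances):
    """Join (speaker, text) turns into one prompt, leaving the last turn open."""
    lines = [f"{speaker}: {text}" for speaker, text in utterances]
    lines.append("Assistant:")  # the model completes this final turn
    return "\n".join(lines)

prompt = build_cou_prompt([
    ("Human", "How do I keep my home network secure?"),
    ("Assistant", "Start by changing the router's default password."),
    ("Human", "What else should I check?"),
])
```

The open final `Assistant:` turn is what lets the same template serve both for distilling responses from ChatGPT and for querying the fine-tuned model.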

Key Capabilities & Performance

  • Enhanced Safety: Experimental results indicate a significant improvement in safety compared to the Vicuna baseline. Starling-7B shows an average 5.2% reduction in Attack Success Rate (ASR) on DangerousQA and HarmfulQA datasets.
  • Improved HHH Scores: The model demonstrates an average 3-7% improvement in HHH (Helpful, Harmless, Honest) scores on the BBH-HHH benchmark.
  • Reasoning & Knowledge: While primarily focused on safety, Starling-7B maintains competitive performance on general benchmarks, scoring 48.90 on TruthfulQA (MC2) and 46.69 on MMLU (5-shot), comparable to or slightly above its Vicuna base.
  • Red-Teaming Resource: The research also introduces the HarmfulQA dataset of 1,960 harmful questions, a valuable resource for red-teaming and safety-alignment efforts.
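Attack Success Rate is a simple ratio: the fraction of adversarial prompts that elicit a harmful response. A minimal sketch of how a reduction like the reported 5.2% would be computed, using hypothetical counts (not from the paper) and assuming the figure is in percentage points:

```python
def attack_success_rate(num_successful_attacks, num_prompts):
    """ASR = fraction of adversarial prompts that elicit a harmful response."""
    return num_successful_attacks / num_prompts

# Hypothetical counts over a 1,000-prompt red-teaming set (illustrative only):
baseline_asr = attack_success_rate(312, 1000)  # Vicuna baseline
aligned_asr = attack_success_rate(260, 1000)   # safety-aligned model

# Reduction expressed in percentage points
reduction_points = (baseline_asr - aligned_asr) * 100
```

Whether a reported "5.2% reduction" means percentage points or a relative decrease matters when comparing across papers; the arithmetic above uses the percentage-point reading.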

When to Use Starling-7B

Starling-7B is particularly well-suited for applications where robust safety and reduced generation of harmful content are critical. This includes:

  • Content Moderation: Filtering or flagging potentially unsafe user inputs or model outputs.
  • Safe AI Assistants: Developing chatbots or virtual assistants that prioritize harmless and ethical responses.
  • Research in AI Safety: As a baseline or tool for further exploration into safety alignment techniques and red-teaming methodologies.
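For the content-moderation use case, a safety-tuned model is typically wrapped so that refusals or unsafe generations are flagged rather than passed through silently. The sketch below uses a stand-in `generate_fn` in place of a real model call (e.g. a Starling-7B inference endpoint); the refusal markers are an illustrative heuristic, not part of the model's documented behavior.

```python
def moderated_generate(generate_fn, prompt, refusal_markers=("i cannot", "i can't")):
    """Call the model and flag responses that look like safety refusals.

    generate_fn is a placeholder for a real model call; refusal_markers is
    a simple illustrative heuristic for detecting refusal-style outputs.
    """
    response = generate_fn(prompt)
    flagged = any(marker in response.lower() for marker in refusal_markers)
    return {"response": response, "refused": flagged}

# Stand-in model for demonstration (a real deployment would call an
# inference backend here):
result = moderated_generate(
    lambda p: "I cannot help with that request.",
    "some unsafe prompt",
)
```

Separating the flagging logic from the model call keeps the wrapper reusable across backends and makes the refusal heuristic easy to replace with a proper safety classifier.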