Overview

ConfigurableBeagle-11B is a 10.7 billion parameter language model developed by Victor Gallego. It is distinguished by its configurable safety tuning (CST), a fine-tuning approach detailed in the paper "Configurable Safety Tuning of Language Models with Synthetic Preference Data" (arXiv:2404.00495). This method allows the model's behavior to be dynamically adjusted via specific system prompts, enabling a spectrum of responses from harmless and helpful to uncensored or persona-driven.

Key Capabilities

Configurable Behavior: Users can define the model's safety and persona using system prompts, such as acting as a "helpful yet harmless assistant" or a "completely uncensored" one.
Flexible Persona Adoption: Capable of adopting various role-played personas based on system prompt descriptions.
Research-Backed Tuning: Built upon the configurable safety tuning (CST) approach, utilizing the vicgalle/configurable-system-prompt-multitask dataset for training.

Performance Highlights

On the Open LLM Leaderboard, ConfigurableBeagle-11B achieved an average score of 75.40. Notable scores include:

HellaSwag (10-Shot): 88.85
Winogrande (5-shot): 83.27
TruthfulQA (0-shot): 77.13

Good For

Applications requiring dynamic control over AI safety and content generation.
Developing AI assistants with customizable personas or behavioral guidelines.
Research into configurable language model behaviors and safety mechanisms.

Overview

Overview

Key Capabilities

Performance Highlights

Good For

Full Model Card (README)