Overview

ConfigurableBeagle-11B is a 10.7 billion parameter language model developed by Victor Gallego. It is distinguished by its configurable safety tuning (CST), a fine-tuning approach detailed in the paper "Configurable Safety Tuning of Language Models with Synthetic Preference Data" (arXiv:2404.00495). This method allows the model's behavior to be dynamically adjusted via specific system prompts, enabling a spectrum of responses from harmless and helpful to uncensored or persona-driven.

Key Capabilities

Configurable Behavior: Users can define the model's safety and persona using system prompts, such as acting as a "helpful yet harmless assistant" or a "completely uncensored" one.
Flexible Persona Adoption: Capable of adopting various role-played personas based on system prompt descriptions.
Research-Backed Tuning: Built upon the configurable safety tuning (CST) approach, utilizing the vicgalle/configurable-system-prompt-multitask dataset for training.

Performance Highlights

On the Open LLM Leaderboard, ConfigurableBeagle-11B achieved an average score of 75.40. Notable scores include:

HellaSwag (10-Shot): 88.85
Winogrande (5-shot): 83.27
TruthfulQA (0-shot): 77.13

Good For

Applications requiring dynamic control over AI safety and content generation.
Developing AI assistants with customizable personas or behavioral guidelines.
Research into configurable language model behaviors and safety mechanisms.