ContextualAI/archangel_dpo_llama13b
Overview
ContextualAI/archangel_dpo_llama13b is a Llama-13B model developed by Contextual AI and trained with the Direct Preference Optimization (DPO) loss function. It has been aligned with human preferences on a combination of the SHP, Anthropic HH, and Open Assistant datasets, and is part of the Human-Centered Loss Functions (HALOs) research initiative.
Key Capabilities
- Preference Alignment: Optimized with the DPO loss for improved alignment with human feedback (the objective is sketched after this list).
- Conversational Formatting: Designed to be prompted in a TuluV2-consistent format, where user and assistant turns are delineated with <|user|> and <|assistant|> tokens (see the usage sketch below).
- Conditional Generation: Supports optional control tokens, <|good|> and <|bad|>, which can be appended to prompts to steer generation toward desired attributes; these control tokens are included in the tokenizer's embeddings.
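For context, DPO fine-tunes the policy directly on preference pairs instead of training a separate reward model. The standard DPO objective (Rafailov et al., 2023) is reproduced below; the specific hyperparameters used for this checkpoint (e.g. the value of $\beta$) are not stated on this card.

$$
\mathcal{L}_{\text{DPO}}(\pi_\theta; \pi_{\text{ref}}) = -\,\mathbb{E}_{(x,\, y_w,\, y_l) \sim \mathcal{D}}\left[\log \sigma\!\left(\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\text{ref}}(y_w \mid x)} - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\text{ref}}(y_l \mid x)}\right)\right]
$$

where $y_w$ and $y_l$ are the preferred and dispreferred responses to prompt $x$, $\pi_{\text{ref}}$ is the reference model, and $\beta$ controls the strength of the implicit KL penalty.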
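Below is a minimal usage sketch with the transformers library, assuming the prompt format described above. The prompt text and generation settings are illustrative, not taken from the model card.

```python
# Minimal sketch: prompting archangel_dpo_llama13b in the TuluV2-consistent
# format. Prompt text and generation settings here are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ContextualAI/archangel_dpo_llama13b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# User and assistant turns are delineated with <|user|> and <|assistant|>.
# Per the card, an optional <|good|> or <|bad|> control token may be added
# to the prompt to steer generation.
prompt = "<|user|>\nExplain direct preference optimization in one sentence.\n<|assistant|>\n"

# The tokenizer prepends BOS automatically; do not append EOS to the prompt.
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:],
                       skip_special_tokens=True))
```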
Usage Notes
- The tokenizer automatically adds a beginning-of-sequence (BOS) token when the input is tokenized; no end-of-sequence (EOS) token is added to the prompt (see the check after this list).
- For more technical details on the underlying research and training instructions, refer to the ContextualAI HALOs code repository and their blog post.
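A quick sanity check of that tokenization behavior, assuming the tokenizer loaded in the sketch above:

```python
# Assumes `tokenizer` from the sketch above: tokenization prepends BOS and
# does not append EOS to the prompt.
ids = tokenizer("<|user|>\nHello\n<|assistant|>\n")["input_ids"]
assert ids[0] == tokenizer.bos_token_id   # BOS is added automatically
assert ids[-1] != tokenizer.eos_token_id  # no EOS is appended
```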
Citation
If you use this model or the associated research, please cite the technical report on Human-Centered Loss Functions (HALOs).