ContextualAI/archangel_sft_llama13b
ContextualAI/archangel_sft_llama13b is a 13-billion-parameter language model from the Llama family, developed by Contextual AI. It is optimized with a Supervised Fine-Tuning (SFT) loss on the SHP, Anthropic HH, and Open Assistant alignment datasets. The model is designed for instruction-following tasks, uses a TuluV2-consistent prompting format, and supports a 4096-token context length.
Archangel SFT Llama 13B Overview
ContextualAI's archangel_sft_llama13b is a 13-billion-parameter model built on the Llama architecture, fine-tuned specifically for instruction following. It is trained with a Supervised Fine-Tuning (SFT) loss and incorporates alignment data from the SHP, Anthropic HH, and Open Assistant datasets to enhance its conversational capabilities.
Key Capabilities & Features
- Instruction Following: Optimized for responding to user prompts in a structured, assistant-like manner.
- TuluV2 Prompting Format: Requires a specific `<|user|>` / `<|assistant|>` turn-based format for optimal interaction, with the human speaking first (see the generation sketch after this list).
- Context Length: Supports a context window of 4096 tokens.
- Conditional SFT: Models trained with conditional SFT include special `<|good|>` and `<|bad|>` tokens in their tokenizers, allowing for controlled generation by appending these to prompts (see the steering sketch after this list).
- Automatic BOS Token: Automatically adds a beginning-of-sequence (BOS) token during tokenization, simplifying prompt preparation.
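A minimal generation sketch, assuming the checkpoint loads with the standard Hugging Face transformers auto classes; the prompt string follows the TuluV2 turn format described above, and the example question is purely illustrative:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ContextualAI/archangel_sft_llama13b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# TuluV2-style turns: the human speaks first, each role tag on its own line.
prompt = "<|user|>\nWhat does a supervised fine-tuning loss optimize?\n<|assistant|>\n"

# The tokenizer adds the BOS token automatically, so the prompt string needs
# no manual special-token handling.
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```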
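The automatic BOS behavior can be verified directly, assuming standard Llama tokenizer behavior in transformers:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("ContextualAI/archangel_sft_llama13b")

# If BOS is auto-added, the first input id is the BOS token id.
ids = tokenizer("Hello")["input_ids"]
print(ids[0] == tokenizer.bos_token_id)  # expected: True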
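For the conditional-SFT variants mentioned above (this plain SFT checkpoint does not carry the control tokens), steering works by appending a control token to the formatted prompt. A sketch of the prompt construction; the exact token placement is an assumption, so consult the HALOs repository for the format used in training:

```python
# Applies to conditional-SFT (csft) variants in the Archangel family, not to
# this plain SFT checkpoint. Control-token placement here is an assumption.
base = "<|user|>\nExplain overfitting in one paragraph.\n<|assistant|>\n"

# Appending <|good|> conditions generation on preferred behavior; use
# "<|bad|>" instead to condition on dispreferred behavior.
steered = base + "<|good|>"
print(steered)
```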
Alignment and Research
This model is part of Contextual AI's research on Human-Aware Loss Functions (HALOs), which focuses on improving LLM alignment. Further details on the training methodology and research can be found in their code repository and technical paper.