Name: CarperAI/stable-vicuna-13b-delta API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: CarperAI

StableVicuna-13B Overview

StableVicuna-13B, developed by CarperAI, is a 13-billion-parameter language model fine-tuned from Vicuna-13B v0, which is itself a fine-tune of the LLaMA 13B transformer. Its key differentiator is Reinforcement Learning from Human Feedback (RLHF): the model is fine-tuned with Proximal Policy Optimization (PPO) on a mix of conversational and instructional datasets.
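To make the PPO step more concrete, the sketch below computes the standard clipped surrogate objective that PPO maximizes. It is a generic illustration only, not CarperAI's training code: the function name, the per-token inputs, and the `clip_eps=0.2` default are assumptions chosen for the example.

```python
import math

def ppo_clipped_objective(logp_new, logp_old, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch of sampled tokens.

    logp_new / logp_old: log-probabilities of the sampled tokens under the
    current policy and the policy that generated the samples.
    advantages: advantage estimates for the same tokens.
    All names and defaults here are illustrative assumptions.
    """
    if not advantages:
        raise ValueError("batch must not be empty")
    total = 0.0
    for lp_new, lp_old, adv in zip(logp_new, logp_old, advantages):
        # Probability ratio between the new and old policy.
        ratio = math.exp(lp_new - lp_old)
        # Keep the ratio within [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the clipped and unclipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

The clipping limits how far a single update can move the policy away from the one that produced the samples, which is the main reason PPO is a common choice for RLHF.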

Key Capabilities

RLHF-Enhanced Conversations: Fine-tuned with PPO on datasets like OpenAssistant Conversations (OASST1), GPT4All Prompt Generations, and Alpaca, making it proficient in generating human-like conversational and instructional responses.
LLaMA Architecture: Inherits the LLaMA transformer architecture through Vicuna-13B v0, which provides the foundation for its language understanding and generation.
Customizable: Designed to be further fine-tuned by users on their specific data to improve performance for particular tasks.

Training Details

The model was trained by Duy Phung of CarperAI using the trlX library. Fine-tuning used a mix of three datasets: OASST1, GPT4All Prompt Generations, and Alpaca. The reward model used for RLHF was trained on OASST1 as well as Anthropic HH-RLHF and the Stanford Human Preferences Dataset, with the aim of aligning responses with human preferences for helpfulness and harmlessness.
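As a rough illustration of how a reward model can learn from preference datasets like these, the sketch below shows the pairwise loss commonly used for this purpose. The source does not state which loss CarperAI used, so this is an assumption, and the function name and inputs are hypothetical.

```python
import math

def pairwise_preference_loss(reward_chosen, reward_rejected):
    """Loss for one preference pair: -log(sigmoid(r_chosen - r_rejected)).

    reward_chosen / reward_rejected: scalar scores a reward model assigns
    to the human-preferred and the rejected response. This is a common
    formulation, assumed here for illustration.
    """
    diff = reward_chosen - reward_rejected
    # -log(sigmoid(diff)) equals log(1 + exp(-diff)); branch so that
    # exp() is only ever called on a non-positive argument.
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))
```

The loss shrinks as the preferred response is scored further above the rejected one, so minimizing it pushes the reward model to rank responses the way the annotators did.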

Intended Use

StableVicuna-13B is primarily intended for text generation, with a strong focus on conversational applications. Users can apply it directly to chat-based tasks or fine-tune it further for specialized use cases, subject to its non-commercial license.
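For chat-based use, a conversation has to be flattened into a single text prompt. The helper below is a minimal sketch of that step. The `### Human:` and `### Assistant:` tags are an assumption that is not confirmed anywhere in this section, so check the full model card before relying on them; `build_prompt` itself is a hypothetical helper.

```python
def build_prompt(history, user_message,
                 human_tag="### Human:", assistant_tag="### Assistant:"):
    """Flatten a chat into one prompt string.

    history: list of (user_text, assistant_text) pairs from earlier turns.
    The tag strings are assumed defaults; pass different ones if the model
    card specifies another format.
    """
    lines = []
    for user_text, assistant_text in history:
        lines.append(f"{human_tag} {user_text}")
        lines.append(f"{assistant_tag} {assistant_text}")
    lines.append(f"{human_tag} {user_message}")
    # End on the bare assistant tag so the model continues from there.
    lines.append(assistant_tag)
    return "\n".join(lines)

prompt = build_prompt([("Hi", "Hello! How can I help?")], "Summarize RLHF.")
```

The resulting string would be sent as the prompt of a text-generation request, with the model's continuation treated as the assistant's reply.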
