pandaExplosion/opendata-chinese-llama2-chat

Text Generation · Model size: 13B · Quantization: FP8 · Context length: 4K · License: apache-2.0 · Architecture: Transformer

pandaExplosion/opendata-chinese-llama2-chat is a 13-billion-parameter chat model developed by pandaExplosion on top of Meta's Llama-2 architecture. It was trained with a combination of supervised fine-tuning (SFT), reward modeling, and Proximal Policy Optimization (PPO) on entirely open-source datasets, including Chinese-translated versions of Alpaca-CoT, Anthropic/hh-rlhf, and OpenAssistant/oasst1. The model is designed for conversational AI in Chinese and shows competitive performance on the C-Eval benchmark.

Model Overview

opendata-chinese-llama2-chat is a 13-billion-parameter conversational model developed by pandaExplosion on Meta's Llama-2 base. It is the chat member of a three-model suite that also includes an SFT model and a reward model, all trained on fully open-source datasets, and it is optimized specifically for Chinese-language interaction.

Key Capabilities

  • Chinese Conversational AI: Fine-tuned for chat applications in Chinese, leveraging translated open-source datasets (a minimal inference sketch follows this list).
  • Llama-2 Architecture: Builds on the Llama-2 foundation, extended through the SFT, reward-modeling, and PPO training stages.
  • Reinforcement Learning from Human Feedback (RLHF): Incorporates a reward model and PPO training for improved conversational quality and alignment.
  • Open-source Training Data: Utilizes a diverse set of open-source datasets, including over 5 million instructions for SFT and 160k ranking pairs for reward modeling.
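
Because the checkpoint follows the standard Llama-2 layout, it should load with the stock transformers classes. The sketch below is illustrative rather than official usage; in particular, the [INST] chat template is an assumption carried over from Llama-2-chat, since the card does not document the prompt format.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "pandaExplosion/opendata-chinese-llama2-chat"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,  # fp16 halves memory; a 13B model still needs ~26 GB
    device_map="auto",          # requires the accelerate package
)

# Assumed Llama-2-chat prompt template; verify against the actual training format.
prompt = "[INST] 请用中文简要介绍一下你自己。 [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256, do_sample=True, top_p=0.9)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```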

Training Details

The model underwent a multi-stage training process using DeepSpeed-Chat:

  • Supervised Fine-Tuning (SFT): Trained for 2 epochs on over 5 million instructions from datasets like QingyiSi/Alpaca-CoT, with a sequence length of 4096 tokens.
  • Reward Model Training: Trained for 2 epochs on 160k ranking pairs from Anthropic/hh-rlhf and OpenAssistant/oasst1 (and their translated versions), using a sequence length of 2048 tokens; a sketch of the pairwise objective follows this list.
  • PPO Stage: Trained for one epoch on 50k prompts, with a sequence length of 2048 tokens.
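
The card does not spell out the reward-modeling objective, but training on ranking pairs typically means a pairwise loss of the shape used in DeepSpeed-Chat's step 2: push the scalar reward of the human-preferred response above that of the rejected one. A minimal sketch, with made-up reward values:

```python
import torch
import torch.nn.functional as F

def pairwise_ranking_loss(chosen_rewards: torch.Tensor,
                          rejected_rewards: torch.Tensor) -> torch.Tensor:
    """-log(sigmoid(r_chosen - r_rejected)), averaged over the batch.

    Each tensor holds one scalar reward per ranking pair, shape (batch,).
    Minimizing this drives chosen rewards above rejected ones.
    """
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# Toy batch of 3 ranking pairs with illustrative reward scores.
chosen = torch.tensor([1.2, 0.3, 0.8])
rejected = torch.tensor([0.4, 0.5, -0.1])
print(pairwise_ranking_loss(chosen, rejected).item())
```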

Performance

On the C-Eval dataset (5-shot), the opendata-chinese-llama2-chat-13B model achieved a Test Average score of 40.5, outperforming both LLaMA-2-13B (36.6) and LLaMA-2-chat-13B (37.2) in the reported evaluations.

Good For

  • Chinese-speaking Chatbots: Ideal for building conversational agents that interact in Chinese.
  • Research on RLHF for Chinese LLMs: Provides a strong baseline for further research and development in this area.
  • Applications requiring Llama-2 compatibility: The checkpoint keeps the standard Llama-2 format, so it drops into existing Llama-2 tooling, as sketched below.
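
As one example of that compatibility, any Llama-2-capable inference engine should accept the checkpoint directly. A sketch using vLLM; the sampling settings here are illustrative, not taken from the model card:

```python
from vllm import LLM, SamplingParams

# Any Llama-2-compatible engine should load the checkpoint as-is.
llm = LLM(model="pandaExplosion/opendata-chinese-llama2-chat")
params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=256)

outputs = llm.generate(["请用中文介绍一下大语言模型。"], params)
print(outputs[0].outputs[0].text)
```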