Overview
ClueAI/ChatYuan-7B is a 7-billion-parameter bilingual (Chinese and English) functional dialogue language model. It is built on the LLaMA-7B architecture and was further trained in a three-stage process to strengthen its capabilities, especially Chinese-language processing and conversational tasks.
Key Training Stages
- Stage 1: Continued pre-training on 50 billion Chinese tokens using general Chinese corpora.
- Stage 2: Task-oriented instruction fine-tuning across hundreds of diverse task sets.
- Stage 3: Instruction fine-tuning on millions of human-feedback examples.
Usage and Merging
To comply with the LLaMA model license, ChatYuan-7B is released as incremental (delta) weights rather than full weights. To obtain the complete ChatYuan-7B model, users must merge these delta weights with the original LLaMA-7B weights. A Python script, apply_delta.py, is provided for this step: it combines a LLaMA-7B model in Hugging Face format with the ChatYuan-7B delta weights.
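The exact command-line interface of apply_delta.py is not reproduced here, but the core merge step — adding each delta tensor to the corresponding base parameter — can be sketched in plain Python. This is a minimal illustration with dicts of floats standing in for checkpoint tensors; the function name `merge_delta` is illustrative, not the script's actual API:

```python
def merge_delta(base_weights, delta_weights):
    """Merge incremental (delta) weights into base model weights.

    Both arguments map parameter names to lists of floats. The real
    script operates on model tensors, but the arithmetic is the same:
    merged[name] = base[name] + delta[name], element-wise.
    """
    if base_weights.keys() != delta_weights.keys():
        raise ValueError("base and delta checkpoints have mismatched parameters")
    return {
        name: [b + d for b, d in zip(base, delta_weights[name])]
        for name, base in base_weights.items()
    }

# Toy example: two tiny "checkpoints" with a single parameter each.
base = {"layer.weight": [0.5, -1.0, 2.0]}
delta = {"layer.weight": [0.25, 0.5, -0.75]}
merged = merge_delta(base, delta)
# merged == {"layer.weight": [0.75, -0.5, 1.25]}
```

In practice the merge is run once, and the resulting full weights are saved and loaded like any other Hugging Face checkpoint.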
Capabilities
- Supports both Chinese and English dialogue generation.
- Demonstrates functional dialogue capabilities, as shown in examples like generating detailed responses to educational questions, continuing articles based on titles, and drafting marketing plans.
Limitations and Restrictions
- May generate factually incorrect information when responding to fact-related instructions.
- Can occasionally produce harmful responses due to difficulty in identifying potentially harmful instructions.
- Requires further improvement in reasoning and coding abilities.
Note: Because of these limitations, the developers restrict the model and its derivatives to research use only, prohibiting commercial use and other potentially harmful applications.