zhichen/Llama3-Chinese

  • Visibility: Public
  • Parameters: 8B
  • Precision: FP8
  • Context length: 8192 tokens
  • Released: Apr 21, 2024
  • Hosted on: Hugging Face
Overview

Llama3-Chinese: Enhanced Bilingual LLM

Llama3-Chinese is an 8-billion-parameter large language model developed by Zhichen Zhang, Xin Lu, and Long Chen. It is built on the Meta-Llama-3-8B architecture and fine-tuned with DoRA and LoRA+ to significantly improve its performance, particularly in Chinese-language contexts.
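DoRA's core idea can be sketched in a few lines of pure Python on a toy weight column (illustrative only; real implementations operate per column of each adapted weight matrix): the pretrained weight is decomposed into a learned magnitude and a direction, and the low-rank LoRA update is applied inside the normalized direction.

```python
import math

# Hedged sketch of DoRA's magnitude/direction decomposition on one toy
# weight column. In practice this is done per column of each adapted
# matrix (e.g. via a PEFT-style LoRA adapter with DoRA enabled).

def l2_norm(v):
    return math.sqrt(sum(x * x for x in v))

def dora_merge(w0, delta, magnitude):
    """Merged column: magnitude * (w0 + delta) / ||w0 + delta||,
    where delta stands in for the low-rank LoRA update B @ A."""
    v = [a + d for a, d in zip(w0, delta)]
    n = l2_norm(v)
    return [magnitude * x / n for x in v]

w0 = [3.0, 4.0]             # pretrained column, ||w0|| = 5
m = l2_norm(w0)             # magnitude is initialized to ||w0||
delta0 = [0.0, 0.0]         # the LoRA update is zero at initialization
merged = dora_merge(w0, delta0, m)   # equals w0 at initialization
```

Because the direction is normalized, the learned magnitude alone controls the merged column's norm, which is what lets DoRA adjust magnitude and direction independently.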

Key Capabilities

  • Bilingual Proficiency: Fine-tuned on 500k high-quality Chinese and 100k English multi-turn SFT examples, enabling strong performance in both languages.
  • Advanced Training Methods: Uses DoRA and LoRA+ for efficient and effective adaptation of the base model.
  • Self-Cognition Data: Incorporates 2k single-turn self-cognition examples, potentially improving its awareness of its own identity, capabilities, and limitations.
  • Open-Source Base: Benefits from the strong foundation of Meta-Llama-3-8B, providing a powerful and widely recognized base model.
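LoRA+, the other training method listed above, is a smaller change: the two low-rank factors A and B are given different learning rates, with B updated faster than A (a ratio around 16 is commonly suggested in the LoRA+ paper). A minimal sketch on scalar stand-ins for the factors — the function name and ratio here are illustrative, not taken from this model's actual training recipe:

```python
def loraplus_step(a, b, grad_a, grad_b, lr_a, ratio=16.0):
    """One SGD step with LoRA+-style decoupled learning rates:
    lr_b = ratio * lr_a, so the B factor adapts faster than A."""
    lr_b = ratio * lr_a
    return a - lr_a * grad_a, b - lr_b * grad_b

# With lr_a = 0.01 and ratio 16, B's effective learning rate is 0.16.
a, b = loraplus_step(a=1.0, b=0.0, grad_a=0.5, grad_b=0.5, lr_a=0.01)
```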

Good For

  • Chinese Language Applications: Ideal for tasks requiring nuanced understanding and generation in Chinese, such as chatbots, content creation, and translation.
  • Bilingual AI Systems: Suitable for use cases that demand seamless switching and interaction between Chinese and English.
  • Research and Development: Provides a strong base for further research into large language models, especially those focused on East Asian languages.
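For the chatbot use cases above, prompts would presumably follow the Llama-3 chat token layout inherited from the base model. In practice `tokenizer.apply_chat_template` from Hugging Face transformers is authoritative; the sketch below hand-builds that layout purely for illustration, assuming this model keeps the standard Llama-3 special tokens:

```python
def build_llama3_prompt(messages):
    """Render chat messages with the Llama-3 special-token layout
    (assumed here; the model's own chat template is authoritative)."""
    parts = ["<|begin_of_text|>"]
    for msg in messages:
        parts.append(
            f"<|start_header_id|>{msg['role']}<|end_header_id|>\n\n"
            f"{msg['content']}<|eot_id|>"
        )
    # A trailing assistant header cues the model to generate its reply.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

prompt = build_llama3_prompt([
    {"role": "system", "content": "You are a helpful bilingual assistant."},
    {"role": "user", "content": "用一句话介绍你自己。"},  # "Introduce yourself in one sentence."
])
```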