zhichen/Llama3-Chinese
Hugging Face

Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Context Length: 8k · Published: Apr 21, 2024 · Architecture: Transformer

Llama3-Chinese is an 8 billion parameter language model developed by Zhichen Zhang, Xin LU, and Long Chen, based on Meta-Llama-3-8B. It is fine-tuned using the DoRA and LoRA+ methods on 500k high-quality Chinese multi-turn SFT examples and 100k English multi-turn SFT examples. The model is optimized for Chinese language understanding and generation, making it suitable for applications requiring strong bilingual capabilities.


Llama3-Chinese: Enhanced Bilingual LLM

Llama3-Chinese is an 8 billion parameter large language model developed by Zhichen Zhang, Xin LU, and Long Chen. It is built on the Meta-Llama-3-8B base model and fine-tuned with DoRA and LoRA+ to significantly improve performance, particularly in Chinese language contexts.

Key Capabilities

  • Bilingual Proficiency: Fine-tuned on a substantial dataset comprising 500k high-quality Chinese multi-turn SFT data and 100k English multi-turn SFT data, enabling strong performance in both languages.
  • Advanced Training Methods: Uses DoRA and LoRA+, parameter-efficient fine-tuning methods that adapt the base model without full retraining.
  • Self-Cognition Data: Incorporates 2k single-turn self-cognition data, potentially enhancing its understanding of its own capabilities and limitations.
  • Open-Source Base: Benefits from the strong foundation of Meta-Llama-3-8B, providing a powerful and widely recognized base model.

Good For

  • Chinese Language Applications: Ideal for tasks requiring nuanced understanding and generation in Chinese, such as chatbots, content creation, and translation.
  • Bilingual AI Systems: Suitable for use cases that demand seamless switching and interaction between Chinese and English.
  • Research and Development: Provides a strong base for further research into large language models, especially those focused on East Asian languages.
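
For the use cases above, one plausible way to run the model locally is via Hugging Face transformers. This is a hedged sketch, not code from the model authors: the repo id `zhichen/Llama3-Chinese` comes from this page, while the dtype, device placement, and example prompt are illustrative, and the use of `apply_chat_template` assumes the repository ships a Llama-3-style chat template.

```python
# Sketch: loading and querying zhichen/Llama3-Chinese with transformers.
# Assumptions (not from this page): bfloat16 weights, a chat template in
# the tokenizer config, and enough GPU/CPU memory for an 8B model.

def build_messages(user_text: str) -> list:
    """Wrap a user prompt in the chat-message format expected by
    tokenizer.apply_chat_template for Llama-3-style models."""
    return [{"role": "user", "content": user_text}]

def generate(user_text: str, max_new_tokens: int = 256) -> str:
    # Imports are local so build_messages stays usable without
    # torch/transformers installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "zhichen/Llama3-Chinese"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.bfloat16, device_map="auto"
    )
    inputs = tokenizer.apply_chat_template(
        build_messages(user_text), add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)

if __name__ == "__main__":
    print(generate("用中文介绍一下你自己。"))  # "Introduce yourself in Chinese."
```

Since the model was fine-tuned on multi-turn SFT data, extending `build_messages` with alternating user/assistant turns is the natural way to carry conversation history.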

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model.

Each configuration specifies: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, and min_p.
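
To make the parameter names above concrete, here is a small helper sketching how such a sampler configuration could be assembled into a request payload for an OpenAI-compatible completions endpoint. This is illustrative only: the numeric values are placeholders, not the actual popular Featherless configs, and the payload shape is an assumption about the serving API rather than documented behavior.

```python
# Illustrative helper: validate a sampler config against the parameter
# names listed on this page and build a request payload. The values
# passed in the usage example below are placeholders, not real configs.

ALLOWED_SAMPLER_KEYS = {
    "temperature", "top_p", "top_k", "frequency_penalty",
    "presence_penalty", "repetition_penalty", "min_p",
}

def build_sampler_payload(prompt: str, **sampler) -> dict:
    """Merge sampler settings into a completion request body,
    rejecting any key not in the documented parameter set."""
    unknown = set(sampler) - ALLOWED_SAMPLER_KEYS
    if unknown:
        raise ValueError(f"unsupported sampler keys: {sorted(unknown)}")
    return {"model": "zhichen/Llama3-Chinese", "prompt": prompt, **sampler}

# Usage with placeholder values:
payload = build_sampler_payload("你好", temperature=0.7, top_p=0.9, min_p=0.05)
```

Validating keys up front catches typos like `rep_penalty` before the request is sent, which otherwise tend to fail silently on servers that ignore unknown fields.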