taide/Llama3-TAIDE-LX-8B-Chat-Alpha1
Llama3-TAIDE-LX-8B-Chat-Alpha1 is an 8-billion-parameter language model developed by the TAIDE project, based on Meta's LLaMA3-8b. It was continuously pre-trained on 43 billion Traditional Chinese tokens and then instruction-tuned to improve performance on office tasks and multi-turn conversation. The model excels at Traditional Chinese language processing, particularly writing, summarization, translation, and general chat, and supports a context length of 8K tokens.
Overview
Llama3-TAIDE-LX-8B-Chat-Alpha1 is an 8 billion parameter model developed by the TAIDE project, building upon Meta's LLaMA3-8b. The TAIDE project focuses on creating generative AI models tailored for Taiwan's language and culture, aiming to reduce reliance on foreign technologies and promote trusted AI development.
Key Capabilities
- Enhanced Traditional Chinese Processing: Continuously pre-trained on 43 billion Traditional Chinese tokens, significantly improving its ability to respond in Traditional Chinese and handle Taiwan-specific tasks.
- Office Task Optimization: Specifically fine-tuned for common office tasks such as automatic summarization, email writing, article generation, and Chinese-English/English-Chinese translation.
- Cultural and Linguistic Nuance: Strengthened with knowledge of local Taiwanese culture, terminology, and national conditions.
- Multi-turn Conversation: Capable of sustained multi-turn dialogue, making it suitable for chat and task-assistance scenarios.
- Data Trustworthiness: Emphasizes strict control over training data to enhance the trustworthiness and applicability of generated content.
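Multi-turn conversations are serialized with the standard Llama 3 chat template inherited from the base model. The helper below is a minimal illustrative sketch of that format; in practice you would let `tokenizer.apply_chat_template` from Hugging Face `transformers` produce this string for you rather than building it by hand.

```python
def build_llama3_prompt(messages):
    """Serialize a list of {"role", "content"} dicts into the
    Llama 3 chat format that this model inherits from LLaMA3-8b."""
    prompt = "<|begin_of_text|>"
    for m in messages:
        prompt += (
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
            f"{m['content']}<|eot_id|>"
        )
    # Cue the model to generate the assistant's next turn.
    prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt

messages = [
    {"role": "system", "content": "你是一個來自台灣的AI助理。"},
    {"role": "user", "content": "請將這句話翻譯成英文:你好,很高興認識你。"},
]
print(build_llama3_prompt(messages))
```

Each prior turn is closed with `<|eot_id|>`, and the trailing assistant header tells the model where its reply begins; appending further user/assistant turns to `messages` is all that multi-turn chat requires.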
Performance
The model was evaluated with taide-bench across Chinese-English translation, English-Chinese translation, summarization, article writing, and letter writing. It achieved an average score of 8.620, comparable to GPT-3.5 (8.676) and ahead of TAIDE's earlier LLaMA2-based models on these Traditional Chinese-centric benchmarks.
Training Details
Llama3-TAIDE-LX-8B-Chat-Alpha1 underwent continuous pre-training on a diverse dataset of approximately 140GB of Traditional Chinese texts, including legal documents, news articles, legislative gazettes, academic papers, and various government and cultural resources. This was followed by instruction tuning using 128K single-turn and multi-turn dialogue examples generated by TAIDE's Llama2 series models, covering world knowledge, creative writing, common sense, translation, summarization, programming, and "Taiwanese values."
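The instruction-tuning set described above consists of single-turn and multi-turn dialogue examples. As a purely hypothetical sketch of what one multi-turn record could look like (the field names and content here are assumptions for illustration, not TAIDE's actual data schema), using the common "messages" convention:

```python
import json

# Hypothetical multi-turn instruction-tuning record in the widely used
# "messages" format; roles alternate user/assistant after the system turn.
record = {
    "messages": [
        {"role": "system", "content": "你是一個來自台灣的AI助理。"},
        {"role": "user", "content": "請幫我把這封信摘要成三句話。"},
        {"role": "assistant", "content": "好的,以下是三句話的摘要:……"},
        {"role": "user", "content": "請再把摘要翻譯成英文。"},
        {"role": "assistant", "content": "Sure, here is the English translation: ..."},
    ]
}

def is_valid_record(rec):
    """Check that roles alternate user/assistant after an optional system turn."""
    msgs = rec["messages"]
    if msgs and msgs[0]["role"] == "system":
        msgs = msgs[1:]
    expected = ["user", "assistant"] * (len(msgs) // 2 + 1)
    return all(m["role"] == want for m, want in zip(msgs, expected))

print(is_valid_record(record))  # expect True for a well-formed record
print(json.dumps(record, ensure_ascii=False)[:60])
```

Records of this shape cover the task families listed above (translation, summarization, creative writing, and so on); a single-turn example is simply one user/assistant pair.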
Good for
- Applications requiring high proficiency in Traditional Chinese.
- Automating office tasks like document summarization, content creation, and translation.
- Chatbots and conversational AI systems designed for Taiwanese users or contexts.
- Generating content that reflects local Taiwanese culture and linguistic nuances.