Overview
Taiwan-LLM-7B-v2.0-base is a 7-billion-parameter GPT-like model developed by Yen-Ting Lin and Yun-Nung Chen and fine-tuned from meta-llama/Llama-2-7b-hf. It focuses on Traditional Chinese, incorporating diverse Taiwanese textual sources and refined through supervised fine-tuning to align the model closely with Taiwan's cultural and linguistic nuances.
Key Capabilities
- Traditional Chinese Proficiency: Specifically designed and optimized for the Traditional Chinese language, making it highly relevant for Taiwanese contexts.
- Cultural Alignment: Enriched with Taiwanese textual sources to enhance cultural relevance and contextual comprehension.
- Improved Performance: Performs well on Traditional Chinese benchmarks such as TC-Eval, indicating strong language understanding and generation within its target domain.
- Base Model for Fine-tuning: Intended to be further fine-tuned for specific instruction-following or chat applications.
Training Details
The model was trained for 5 epochs with a learning rate of 5e-05, the Adam optimizer (betas=(0.9, 0.999), epsilon=1e-08), and a cosine learning-rate scheduler with a warmup ratio of 0.03. Taiwan LLM v2 was developed in collaboration with Ubitus K.K., which provided compute resources.
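The schedule described above (linear warmup over the first 3% of steps, then cosine decay) can be sketched in plain Python. This is an illustrative reimplementation of the standard warmup-plus-cosine recipe, not code from the Taiwan LLM training run; the 10,000-step horizon in the example is a made-up figure.

```python
import math

def lr_at_step(step, total_steps, peak_lr=5e-5, warmup_ratio=0.03):
    """Learning rate at a given optimizer step: linear warmup to peak_lr
    over warmup_ratio * total_steps, then cosine decay to zero."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 to peak_lr during warmup.
        return peak_lr * step / max(1, warmup_steps)
    # Cosine decay from peak_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Example over a hypothetical 10,000-step run (warmup ends at step 300):
total = 10_000
print(lr_at_step(0, total))      # 0.0 at the start of warmup
print(lr_at_step(300, total))    # 5e-05, the peak learning rate
print(lr_at_step(total, total))  # ~0.0, fully decayed
```

In practice this matches the behavior of common cosine-with-warmup scheduler helpers in training frameworks; the warmup ratio and peak learning rate are the values quoted above.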
Good For
- Applications requiring deep understanding and generation of Traditional Chinese.
- Developing chatbots or instruction-following models tailored for the Taiwanese cultural context.
- Research into culturally and linguistically specific large language models.
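For the fine-tuning and chatbot use cases above, a typical starting point is loading the base model with the Hugging Face transformers library. The sketch below is a hedged assumption: the repo id `yentinglin/Taiwan-LLM-7B-v2.0-base` and the use of the generic `AutoModelForCausalLM` API are not confirmed by this card, and downloading the weights requires the third-party `transformers` (and `accelerate`) packages.

```python
# Hypothetical loading sketch; the Hugging Face repo id is an assumption.
MODEL_ID = "yentinglin/Taiwan-LLM-7B-v2.0-base"

def load_base_model(model_id: str = MODEL_ID):
    """Fetch tokenizer and weights as a starting point for further
    supervised fine-tuning (requires `transformers` and `accelerate`)."""
    from transformers import AutoModelForCausalLM, AutoTokenizer  # third-party
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype="auto",   # keep the checkpoint's native precision
        device_map="auto",    # place layers across available devices
    )
    return tokenizer, model
```

Since this is a base (non-instruction-tuned) checkpoint, outputs are plain continuations; chat behavior requires additional instruction tuning as noted in Key Capabilities.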