Overview
Taiwan-LLM-7B-v2.0-base is a 7-billion-parameter GPT-like model developed by Yen-Ting Lin and Yun-Nung Chen and fine-tuned from meta-llama/Llama-2-7b-hf. It focuses on Traditional Chinese, incorporating diverse Taiwanese textual sources and refined through supervised fine-tuning to align the model closely with Taiwan's cultural and linguistic nuances.
Key Capabilities
- Traditional Chinese Proficiency: Specifically designed and optimized for the Traditional Chinese language, making it highly relevant for Taiwanese contexts.
- Cultural Alignment: Enriched with Taiwanese textual sources to enhance cultural relevance and contextual comprehension.
- Improved Performance: Performs well on Traditional Chinese benchmarks such as TC-Eval, indicating strong language understanding and generation within its target domain.
- Base Model for Fine-tuning: Intended to be further fine-tuned for specific instruction-following or chat applications.
Training Details
The model was trained for 5 epochs with a learning rate of 5e-05, the Adam optimizer (betas=(0.9, 0.999), epsilon=1e-08), and a cosine learning-rate scheduler with a warmup ratio of 0.03. Taiwan LLM v2 was developed in collaboration with Ubitus K.K., which provided compute resources.
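The schedule described above (linear warmup over the first 3% of steps, then cosine decay) can be sketched in plain Python. This is an illustrative reimplementation of the standard warmup-plus-cosine recipe, not code from the Taiwan LLM training run; the 10,000-step horizon in the example is a made-up figure.

```python
import math

def lr_at_step(step, total_steps, peak_lr=5e-5, warmup_ratio=0.03):
    """Learning rate at a given optimizer step: linear warmup to peak_lr
    over warmup_ratio * total_steps, then cosine decay to zero."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 to peak_lr during warmup.
        return peak_lr * step / max(1, warmup_steps)
    # Cosine decay from peak_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Example over a hypothetical 10,000-step run (warmup ends at step 300):
total = 10_000
print(lr_at_step(0, total))      # 0.0 at the start of warmup
print(lr_at_step(300, total))    # 5e-05, the peak learning rate
print(lr_at_step(total, total))  # ~0.0, fully decayed
```

In practice this matches the behavior of common cosine-with-warmup scheduler helpers in training frameworks; the warmup ratio and peak learning rate are the values quoted above.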
Good For
- Applications requiring deep understanding and generation of Traditional Chinese.
- Developing chatbots or instruction-following models tailored for the Taiwanese cultural context.
- Research into culturally and linguistically specific large language models.
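For the fine-tuning and chatbot use cases above, a typical starting point is loading the base model with the Hugging Face transformers library. The sketch below is a hedged assumption: the repo id `yentinglin/Taiwan-LLM-7B-v2.0-base` and the use of the generic `AutoModelForCausalLM` API are not confirmed by this card, and downloading the weights requires the third-party `transformers` (and `accelerate`) packages.

```python
# Hypothetical loading sketch; the Hugging Face repo id is an assumption.
MODEL_ID = "yentinglin/Taiwan-LLM-7B-v2.0-base"

def load_base_model(model_id: str = MODEL_ID):
    """Fetch tokenizer and weights as a starting point for further
    supervised fine-tuning (requires `transformers` and `accelerate`)."""
    from transformers import AutoModelForCausalLM, AutoTokenizer  # third-party
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype="auto",   # keep the checkpoint's native precision
        device_map="auto",    # place layers across available devices
    )
    return tokenizer, model
```

Since this is a base (non-instruction-tuned) checkpoint, outputs are plain continuations; chat behavior requires additional instruction tuning as noted in Key Capabilities.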