Overview
Model Overview
benchang1110/Qwen2.5-Taiwan-7B-Instruct is a 7.6-billion-parameter language model developed by benchang1110 on top of the Qwen/Qwen2.5-7B-Instruct base model. It is optimized for Traditional Chinese (zh-tw) as used in Taiwan, targeting applications that require nuanced understanding and generation in that language.
Key Differentiators
- Tokenizer Swapping: Uses a tokenizer swapping technique to remap the base model's Simplified Chinese tokens to their Traditional Chinese counterparts, improving lexical coverage for Taiwanese text (a sketch of one possible approach follows this list).
- Instruction Tuning (SFT): Fine-tuned on lianghsun/tw-instruct-500k with LoRA adapters, preserving the base model's general capabilities while specializing in Traditional Chinese instruction following (see the adapter sketch after this list).
- Alignment (DPO): Further aligned with zake7749/kyara-chinese-preference-rl-dpo-s0-30K to encourage well-structured, logically organized responses, including list-formatted answers.
- Long Context: Supports a context window of 131,072 tokens, enabling processing of long conversations and documents.
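The exact tokenizer swapping recipe is not documented here, so the following is only a minimal sketch of one plausible interpretation: use OpenCC to find Simplified Chinese tokens whose Traditional Chinese form is already a single token in the vocabulary, then swap the two embedding rows so the Traditional token inherits the better-trained weights. The use of OpenCC, the single-token restriction, and the row-swap strategy are all assumptions for illustration, not the author's published method.

```python
# Sketch of one possible tokenizer-swapping approach (an assumption, NOT the
# author's published recipe): swap embedding rows between Simplified Chinese
# tokens and their single-token Traditional Chinese counterparts.
import torch
from opencc import OpenCC                        # pip install opencc-python-reimplemented
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "Qwen/Qwen2.5-7B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.bfloat16)
s2t = OpenCC("s2t")                              # Simplified -> Traditional converter

with torch.no_grad():
    emb = model.get_input_embeddings().weight
    swapped = 0
    for token_id in range(len(tokenizer)):
        text = tokenizer.decode([token_id])      # surface form of this token
        converted = s2t.convert(text)
        if converted == text:
            continue                             # not a Simplified-only token
        # Only handle the simple case where the Traditional form is itself
        # a single token in the existing vocabulary.
        ids = tokenizer.encode(converted, add_special_tokens=False)
        if len(ids) == 1 and ids[0] != token_id:
            # One plausible "swap": give the Traditional token the embedding the
            # model learned for its (more frequently seen) Simplified twin.
            tmp = emb[ids[0]].clone()
            emb[ids[0]] = emb[token_id]
            emb[token_id] = tmp
            swapped += 1
    # If input and output embeddings are not tied, the lm_head rows would need
    # the same treatment.
    print(f"swapped {swapped} Simplified/Traditional embedding pairs")
```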
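The card states that the SFT stage used LoRA on lianghsun/tw-instruct-500k, followed by DPO on zake7749/kyara-chinese-preference-rl-dpo-s0-30K, but the hyperparameters are not given here. The snippet below is a minimal sketch of attaching LoRA adapters with peft before supervised fine-tuning; the rank, alpha, dropout, and target modules are illustrative guesses, not the published configuration.

```python
# Minimal sketch of attaching LoRA adapters for the SFT stage with peft.
# Rank/alpha/target modules below are illustrative guesses, not the published recipe.
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "Qwen/Qwen2.5-7B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.bfloat16)

lora_cfg = LoraConfig(
    r=16,                          # adapter rank (guess)
    lora_alpha=32,                 # scaling factor (guess)
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()   # only the adapter weights are trainable

# From here, train on an instruction dataset such as lianghsun/tw-instruct-500k
# (e.g. with trl's SFTTrainer), then optionally run a DPO stage on a preference
# dataset like zake7749/kyara-chinese-preference-rl-dpo-s0-30K.
```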
Performance
The model demonstrates strong performance on Traditional Chinese benchmarks:
- TMLU: 68.27% accuracy, ahead of comparable models such as Llama-3-Taiwan-8B-Instruct (59.50%) and Llama-3.2-Taiwan-3B-Instruct (36.82%).
- TMMLU+: 58.60% accuracy, again ahead of Llama-3-Taiwan-8B-Instruct (52.28%) and Llama-3.2-Taiwan-3B-Instruct (31.15%).
Use Cases
This model excels in:
- Multi-turn conversations in Traditional Chinese.
- Complex text generation, such as article writing and formal letter composition.
- Summarization of Traditional Chinese documents.
- Role-playing scenarios with customizable system prompts.
- Taiwan-specific knowledge, such as listing well-known Taiwanese attractions (see the inference sketch below).
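A minimal inference sketch using the transformers chat template, with a Traditional Chinese system prompt and a Taiwan-specific question; the prompts and sampling settings are illustrative, not prescribed by the model card.

```python
# Minimal chat-style inference sketch; prompts and sampling settings are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "benchang1110/Qwen2.5-Taiwan-7B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    # System prompt in Traditional Chinese: "You are a helpful Traditional Chinese assistant."
    {"role": "system", "content": "你是一個樂於助人的繁體中文助理。"},
    # User turn: "Please recommend three attractions worth visiting in Taiwan."
    {"role": "user", "content": "請推薦三個台灣值得一遊的景點。"},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```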