Qwen2.5-Taiwan-3B-Instruct: Optimized for Traditional Chinese
This model, developed by benchang1110, is a 3.1-billion-parameter instruction-tuned causal language model built on the Qwen/Qwen2.5-3B-Instruct base. It was adapted for Traditional Chinese (zh-tw) through a training pipeline combining tokenizer swapping, Supervised Fine-Tuning (SFT), and Direct Preference Optimization (DPO).
Key Capabilities
- Superior Traditional Chinese Understanding: Achieves 59.94% on TMLU and 49.52% on TMMLU+, outperforming other 3B-class models and rivaling some 8B models on Traditional Chinese benchmarks.
- Multi-turn Dialogue: Capable of coherent, logical multi-turn conversations (a minimal chat-loop sketch follows the "Good for" list below).
- Content Generation: Excels at tasks like article writing, letter composition, and summarization.
- Translation: Proficient in translating between English and Traditional Chinese.
- Role-playing: Can adapt its persona based on system prompts for various role-playing scenarios (see the usage sketch after this list).
- Taiwan-Specific Knowledge: Demonstrates strong understanding of Taiwan-related queries, such as questions about local attractions.
- Extended Context Length: Supports a context length of 32,768 tokens for processing longer inputs.
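Since the model inherits Qwen2.5's chat format, loading and prompting it should follow the standard transformers pattern. The sketch below is illustrative rather than taken from the model card: the Hub id benchang1110/Qwen2.5-Taiwan-3B-Instruct is inferred from the author and model names above, and the dtype, prompts, and sampling settings are assumptions.

```python
# Minimal usage sketch with Hugging Face transformers.
# Hub id, dtype, prompts, and sampling settings are assumptions, not verified
# against the repository.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "benchang1110/Qwen2.5-Taiwan-3B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # use float16 if your GPU lacks bf16 support
    device_map="auto",
)

# Per the card, a system prompt can steer the model's persona for role-playing.
messages = [
    {"role": "system", "content": "你是一位熱情的台灣導遊。"},  # "You are an enthusiastic Taiwanese tour guide."
    {"role": "user", "content": "請推薦三個台北的景點。"},      # "Please recommend three attractions in Taipei."
]

inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512, do_sample=True, temperature=0.7)
# Strip the prompt tokens and decode only the newly generated reply.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Because generation here is sampled (do_sample=True), outputs vary from run to run; greedy decoding gives more deterministic replies.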
Good for
- Applications requiring high-quality Traditional Chinese text generation and comprehension.
- Building chatbots and conversational AI systems for Taiwanese users.
- Summarization and translation tasks involving Traditional Chinese content.
- Developing AI assistants tuned to Taiwan's cultural and linguistic nuances.
- Researchers and developers who want a cost-effective model for Traditional Chinese NLP that builds on Qwen's strong base capabilities.
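For the chatbot use cases above, multi-turn coherence comes from replaying the full message history on every turn rather than from any persistent state in the model. A minimal sketch under the same assumptions as the earlier example (Hub id, dtype, and prompts are illustrative):

```python
# Minimal multi-turn loop: the whole message history is re-encoded each turn
# so the model can stay coherent across turns. Hub id and prompts are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "benchang1110/Qwen2.5-Taiwan-3B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

def chat(messages, user_text, max_new_tokens=256):
    """Append a user turn, generate a reply, and record it in the history."""
    messages.append({"role": "user", "content": user_text})
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=max_new_tokens)
    reply = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
    messages.append({"role": "assistant", "content": reply})
    return reply

history = [{"role": "system", "content": "你是一個有用的繁體中文助理。"}]  # "You are a helpful Traditional Chinese assistant."
print(chat(history, "用一句話介紹日月潭。"))  # "Describe Sun Moon Lake in one sentence."
print(chat(history, "它在哪個縣市？"))        # Follow-up: "Which county/city is it in?" -- resolved via history
```

The second question only makes sense given the first, which is exactly what replaying the history provides; the 32,768-token context gives ample room for long conversations before history needs truncating.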