benchang1110/Qwen2.5-Taiwan-3B-Instruct

Task: Text Generation · Model Size: 3.1B · Quant: BF16 · Context Length: 32K · Published: Feb 26, 2025 · License: qwen-research · Architecture: Transformer

benchang1110/Qwen2.5-Taiwan-3B-Instruct is a 3.1 billion parameter instruction-tuned causal language model developed by benchang1110, fine-tuned from Qwen/Qwen2.5-3B-Instruct. It features a 32K context length and is specifically optimized for Traditional Chinese (zh-tw) language understanding and generation through tokenizer swapping, SFT, and DPO training. This model excels in dialogue, summarization, translation, and role-playing tasks, demonstrating strong performance on Traditional Chinese benchmarks like TMLU and TMMLU+.


Qwen2.5-Taiwan-3B-Instruct: Optimized for Traditional Chinese

This model, developed by benchang1110, is a 3.1 billion parameter instruction-tuned causal language model built on the Qwen/Qwen2.5-3B-Instruct base. It has been adapted for Traditional Chinese (zh-tw) through a three-stage training pipeline: tokenizer swapping, Supervised Fine-Tuning (SFT), and Direct Preference Optimization (DPO).

Key Capabilities

  • Superior Traditional Chinese Understanding: Achieves a TMLU score of 59.94% and a TMMLU+ score of 49.52%, outperforming other 3B models and even competing with some 8B models in Traditional Chinese benchmarks.
  • Multi-turn Dialogue: Capable of engaging in coherent and logical multi-turn conversations.
  • Content Generation: Excels at tasks like article writing, letter composition, and summarization.
  • Translation: Proficient in translating between English and Traditional Chinese.
  • Role-playing: Can adapt its persona based on system prompts for various role-playing scenarios.
  • Taiwan-Specific Knowledge: Demonstrates strong understanding of Taiwan-related queries, such as local attractions.
  • Extended Context Length: Supports a context length of 32,768 tokens for processing longer inputs.
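The capabilities above are exposed through the standard Hugging Face `transformers` chat interface. Below is a minimal inference sketch under that assumption; the model id comes from this card, while the system prompt, sampling settings, and helper names are illustrative, not part of the card.

```python
# Minimal inference sketch for Qwen2.5-Taiwan-3B-Instruct.
# Assumes `transformers` and `torch` are installed and that there is
# enough memory for a 3B-parameter model in bf16.
MODEL_ID = "benchang1110/Qwen2.5-Taiwan-3B-Instruct"


def build_messages(user_prompt: str,
                   system_prompt: str = "你是一個使用繁體中文的助理。") -> list:
    """Assemble the system + user turns consumed by the chat template."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]


def generate_reply(user_prompt: str, max_new_tokens: int = 256) -> str:
    """Download the weights (several GB in bf16) and generate one reply."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
    )
    input_ids = tokenizer.apply_chat_template(
        build_messages(user_prompt),
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)
    output = model.generate(input_ids, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(output[0][input_ids.shape[-1]:],
                            skip_special_tokens=True)
```

For Taiwan-specific queries such as local attractions, a call like `generate_reply("請簡單介紹台北101。")` would exercise the zh-tw tuning directly; swapping the system prompt changes the persona for role-playing use.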

Good for

  • Applications requiring high-quality Traditional Chinese text generation and comprehension.
  • Building chatbots and conversational AI systems for Taiwanese users.
  • Summarization and translation tasks involving Traditional Chinese content.
  • Developing AI assistants with specific cultural and linguistic nuances for Taiwan.
  • Researchers and developers seeking a cost-effective yet powerful model for Traditional Chinese NLP, leveraging the strong base capabilities of Qwen.
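For serving stacks that consume raw prompt strings rather than message lists (chatbot backends, custom inference servers), Qwen2.5-family instruction models use the ChatML turn format. The sketch below assumes the standard Qwen2.5 ChatML layout with `<|im_start|>` / `<|im_end|>` special tokens; that layout is inherited from the base model and is not stated on this card.

```python
def to_chatml(messages: list) -> str:
    """Render a message list as a Qwen2.5-style ChatML prompt string,
    leaving an open assistant turn for the model to complete."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


# Example: a translation request rendered as a raw prompt.
prompt = to_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "將這句話翻譯成英文:今天天氣很好。"},
])
```

In practice, `tokenizer.apply_chat_template` produces the equivalent string from the model's bundled template, so a hand-rolled formatter like this is only needed when the tokenizer is unavailable at prompt-building time.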