benchang1110/Qwen2.5-Taiwan-3B-Instruct
Text generation · Model size: 3.1B · Quant: BF16 · Context length: 32K · Published: Feb 26, 2025 · License: qwen-research · Architecture: Transformer

benchang1110/Qwen2.5-Taiwan-3B-Instruct is a 3.1-billion-parameter instruction-tuned causal language model developed by benchang1110 and fine-tuned from Qwen/Qwen2.5-3B-Instruct. It supports a 32K context length and is optimized for Traditional Chinese (zh-tw) understanding and generation through tokenizer swapping, supervised fine-tuning (SFT), and direct preference optimization (DPO). The model targets dialogue, summarization, translation, and role-playing tasks, and performs strongly on Traditional Chinese benchmarks such as TMLU and TMMLU+.
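A minimal sketch of running the model for Traditional Chinese dialogue with Hugging Face transformers. The model id comes from this card; the system prompt, generation parameters, and the `chat` helper itself are illustrative assumptions, not part of the model's documentation.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer


def chat(prompt: str,
         model_id: str = "benchang1110/Qwen2.5-Taiwan-3B-Instruct",
         max_new_tokens: int = 256) -> str:
    """Generate a single-turn reply from the model (hypothetical helper)."""
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # torch_dtype="auto" picks up the BF16 weights; device_map="auto"
    # places the 3.1B model on GPU if one is available.
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype="auto", device_map="auto"
    )
    messages = [
        {"role": "system", "content": "你是一個樂於助人的助理。"},  # assumed system prompt
        {"role": "user", "content": prompt},
    ]
    # Qwen2.5-style models expect the chat template applied before generation.
    text = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Strip the prompt tokens, keep only the newly generated reply.
    reply_ids = output[0][inputs.input_ids.shape[-1]:]
    return tokenizer.decode(reply_ids, skip_special_tokens=True)


if __name__ == "__main__":
    print(chat("請用繁體中文簡單介紹台北101。"))
```

Since the model ships in BF16, it fits comfortably on a single consumer GPU with about 8 GB of memory; CPU inference also works, just slowly.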