Model Overview
lianghsun/Llama-3.2-Taiwan-3B-Instruct is a 3-billion-parameter instruction-tuned language model from the Llama 3.2 family, developed by Huang Liang Hsun. It is built on the lianghsun/Llama-3.2-Taiwan-3B foundation model and has undergone extensive instruction fine-tuning followed by multiple rounds of Direct Preference Optimization (DPO). Its primary goal is conversational capability centered on Traditional Chinese knowledge and style specific to Taiwan, with a context length of 32,768 tokens.
Key Capabilities
- Taiwan-centric Traditional Chinese Dialogue: Optimized for generating responses reflecting Taiwanese knowledge and conversational style.
- Multilingual Support: Trained with both Traditional Chinese and other multilingual conversational datasets.
- Instruction Following: Enhanced through instruction fine-tuning and DPO for better adherence to user prompts.
- Small Language Model (SLM): A compact 3B-parameter footprint makes the model efficient to deploy.
Good For
- Direct Deployment: Ready for use in inference endpoints for Traditional Chinese dialogue generation (see the inference sketch after this list).
- Domain-Specific Fine-tuning: Can be further fine-tuned to deepen expertise in specific domains, particularly those requiring Taiwanese context (see the fine-tuning sketch after this list).
- Applications requiring localized Traditional Chinese interaction.
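The snippet below is a minimal inference sketch using Hugging Face transformers. It assumes the tokenizer ships the standard Llama 3.2 chat template and that bfloat16 weights fit the target hardware; the prompts and sampling parameters are illustrative, not values from the model card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "lianghsun/Llama-3.2-Taiwan-3B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 fits the target hardware
    device_map="auto",
)

messages = [
    {"role": "system", "content": "你是一位台灣的 AI 助理。"},  # "You are a Taiwanese AI assistant."
    {"role": "user", "content": "請簡單介紹台北 101。"},        # "Briefly introduce Taipei 101."
]

# Render the chat template into a prompt and tokenize it.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```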
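For domain adaptation, one common route is supervised fine-tuning with TRL's SFTTrainer. The sketch below is one possible setup, not the author's training recipe: the dataset file, output directory, and hyperparameters are all placeholders.

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

model_id = "lianghsun/Llama-3.2-Taiwan-3B-Instruct"

# Hypothetical chat-formatted dataset: one {"messages": [...]} record per line.
train_dataset = load_dataset(
    "json", data_files="taiwan_domain_chat.jsonl", split="train"
)

trainer = SFTTrainer(
    model=model_id,  # TRL loads the model and tokenizer from the Hub
    train_dataset=train_dataset,
    args=SFTConfig(
        output_dir="llama-3.2-taiwan-3b-domain",  # placeholder
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
        learning_rate=2e-5,
        bf16=True,
        logging_steps=10,
    ),
)
trainer.train()
```

Capping training sequences well below the model's 32,768-token context keeps memory use manageable on a single GPU.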
Limitations
The model may still exhibit biases or produce inaccurate content due to limitations in the diversity of its training data, so users are advised to verify generated output for accuracy and neutrality. Initial evaluations on tmmlu++ and tw-legal-benchmark-v1 produced scores below passing, indicating that more specialized domain data is needed to improve performance in those areas.
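To reproduce a tmmlu++ score, EleutherAI's lm-evaluation-harness is one option. This is a hedged sketch: it assumes the harness's tmmluplus task group corresponds to the tmmlu++ benchmark above; tw-legal-benchmark-v1 is not bundled with the harness and is omitted here.

```python
# pip install lm-eval
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=lianghsun/Llama-3.2-Taiwan-3B-Instruct,dtype=bfloat16",
    tasks=["tmmluplus"],  # assumption: maps to the tmmlu++ benchmark
    batch_size=8,
)
print(results["results"])
```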