The Malaysian-Llama-3.2-3B-Instruct model, developed by mesolitica, is a 3-billion-parameter instruction-tuned causal language model based on Llama-3.2-3B-Instruct. It is fine-tuned on a 1.5-billion-token Malaysian instruction dataset, enabling it to respond and generate code in a range of Malaysian languages, dialects, and scripts, including Mandarin, Tamil, and Jawi. The model handles multi-turn Malaysian contexts, covering topics such as legislation, politics, religion, and local languages, and improves on its base model's performance on the MalayMMLU benchmark.
Malaysian Llama-3.2-3B-Instruct Overview
This model is a specialized instruction-tuned variant of Llama-3.2-3B-Instruct, developed by mesolitica. It has been fine-tuned on a curated 1.5-billion-token Malaysian instruction dataset to strengthen its handling of local contexts. With 3 billion parameters and a 32,768-token context length, it is designed to understand and generate content relevant to Malaysia.
Key Capabilities
- Multilingual and Dialectal Support: Responds and generates code in a wide array of Malaysian languages, scripts, and dialects, including Mandarin, Tamil, Jawi, Manglish, and the Johor, Kedah, Kelantan, Pahang, Perak, Sabah, Sarawak, Selangor, Negeri Sembilan, and Terengganu dialects.
- Malaysian Contextual Understanding: Excels in multi-turn conversations related to Malaysian legislation, politics, religions, and local languages.
- Improved MalayMMLU Performance: Demonstrates an average accuracy of 58.43% on the MalayMMLU benchmark (0-shot, first token accuracy), outperforming the original Llama-3.2-3B-Instruct's 56.45% average accuracy.
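The "first token accuracy" scoring referenced above is commonly implemented by comparing the model's next-token logits over the multiple-choice option labels. A minimal sketch of that idea, using made-up logits and token ids rather than any real model output:

```python
# Sketch of "first token accuracy" scoring for a multiple-choice benchmark
# such as MalayMMLU: the predicted answer is the option label whose first
# token receives the highest next-token logit. The logits and token ids
# below are fabricated for illustration; a real evaluation harness would
# obtain them from the model and its tokenizer.

def first_token_choice(next_token_logits, option_token_ids):
    """Return the option label whose first token has the highest logit."""
    return max(option_token_ids, key=lambda opt: next_token_logits[option_token_ids[opt]])

def first_token_accuracy(examples, option_token_ids):
    """examples: list of (next_token_logits, gold_label) pairs."""
    correct = sum(
        first_token_choice(logits, option_token_ids) == gold
        for logits, gold in examples
    )
    return correct / len(examples)

# Toy demo: token ids 3..6 stand in for the labels "A".."D".
option_ids = {"A": 3, "B": 4, "C": 5, "D": 6}
logits = [0.0] * 10
logits[5] = 2.0  # the model puts the most mass on the token for "C"
print(first_token_choice(logits, option_ids))  # C
```

Because only the first generated token is scored, this metric is cheap to compute in a 0-shot setting and does not depend on sampling parameters.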
Good for
- Applications requiring deep understanding and generation of Malaysian-specific content.
- Developing chatbots or virtual assistants tailored for the Malaysian market.
- Code generation in various Malaysian linguistic contexts.
- Research and development in low-resource language processing, particularly for Malaysian dialects.
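The chatbot use case above can be sketched with Hugging Face transformers. This is a hedged sketch, not the official usage: the Hub repo id, prompts, and generation settings are assumptions, so verify them against the model card before use.

```python
# Hypothetical usage sketch for a Malaysian-context chatbot.
# The repo id below is an assumption -- verify the exact id on the Hub.
MODEL_ID = "mesolitica/Malaysian-Llama-3.2-3B-Instruct"

def build_messages(turns, system_prompt=None):
    """Build a chat-template message list from (role, text) turns."""
    messages = []
    if system_prompt is not None:
        messages.append({"role": "system", "content": system_prompt})
    messages.extend({"role": role, "content": text} for role, text in turns)
    return messages

def generate_reply(turns, system_prompt=None, max_new_tokens=256):
    """Load the model and generate one reply (downloads ~3B weights)."""
    from transformers import AutoModelForCausalLM, AutoTokenizer  # heavy import

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    inputs = tokenizer.apply_chat_template(
        build_messages(turns, system_prompt),
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)
    output = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)

# Lightweight demo of the message builder (no model download needed):
print(build_messages(
    [("user", "Terangkan secara ringkas sistem Parlimen Malaysia.")],
    system_prompt="Jawab dalam Bahasa Melayu.",
))
```

The prompt asks, in Malay, for a brief explanation of Malaysia's parliamentary system; the system prompt instructs the model to answer in Bahasa Melayu.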