mesolitica/Malaysian-Llama-3.2-1B-Instruct-v0.1

Hugging Face · Text generation · Concurrency cost: 1 · Model size: 1B · Quantization: BF16 · Context length: 32k · Published: Oct 15, 2024 · Architecture: Transformer · Warm

mesolitica/Malaysian-Llama-3.2-1B-Instruct-v0.1 is a 1-billion-parameter instruction-tuned causal language model developed by mesolitica, based on Llama-3.2-1B. It features an extended 128k context length and is fine-tuned on a 1.5-billion-token Malaysian instruction dataset. The model understands and generates responses in a range of Malaysian dialects and languages, including Mandarin and Tamil, and handles multi-turn conversations on Malaysian topics such as legislation, politics, and religion, as well as role-playing.


Malaysian Llama 3.2 1B Instruct v0.1

This model, developed by mesolitica, is an instruction-tuned variant of the Llama-3.2-1B architecture with 1 billion parameters. It has been fine-tuned on a curated 1.5-billion-token Malaysian instruction dataset, strengthening its handling of the Malaysian linguistic and cultural context.
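As a quick illustration of how the model can be used, the sketch below loads it with Hugging Face transformers and generates a reply to a single Malay instruction. It assumes the model ships with the standard Llama 3.2 chat template and that bfloat16 weights fit on the available device; the generation settings are illustrative, not documented recommendations.

```python
# Minimal usage sketch; settings are illustrative, not the author's recommendations.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mesolitica/Malaysian-Llama-3.2-1B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# "Briefly explain what the Rukun Negara is."
messages = [
    {"role": "user", "content": "Terangkan secara ringkas apa itu Rukun Negara."},
]

# Build the prompt with the tokenizer's chat template and generate a reply.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```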

Key Capabilities

  • Extended Context Length: A 128k context window allows processing of longer and more complex inputs.
  • Multilingual and Dialectal Support: Capable of responding and coding in a wide array of languages and dialects relevant to Malaysia, including Mandarin, Tamil, Jawi, and various regional Malay dialects (Johor, Kedah, Kelantan, Pahang, Perak, Sabah, Sarawak, Selangor, Negeri Sembilan, Terengganu).
  • Malaysian Contextual Understanding: Excels in multi-turn conversations on topics specific to Malaysia, such as legislation, politics, religions, and local languages.
  • Role-Playing: Supports Malaysian-specific role-playing scenarios.
  • Standard RAG: Supports standard Retrieval-Augmented Generation (RAG) prompting (see the sketch after this list).
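The exact prompt layout used for the model's RAG fine-tuning is not documented here, so the following is only a generic sketch of retrieval-augmented prompting under that assumption: hypothetical retrieved passages (the `retrieved_passages` list is a stand-in for a retriever's output) are placed in the user turn ahead of the question, and the model answers from that context.

```python
# Generic RAG-style prompting sketch; the prompt layout is an assumption,
# not the model's documented RAG format.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mesolitica/Malaysian-Llama-3.2-1B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Hypothetical retrieved passages (Malay) about the Personal Data Protection Act 2010.
retrieved_passages = [
    "Akta Perlindungan Data Peribadi 2010 (Akta 709) mengawal pemprosesan data "
    "peribadi dalam transaksi komersial.",
    "Akta ini tidak terpakai kepada Kerajaan Persekutuan dan Kerajaan Negeri.",
]
context = "\n\n".join(retrieved_passages)

# "Who is exempted from the Personal Data Protection Act 2010?"
question = "Siapakah yang dikecualikan daripada Akta Perlindungan Data Peribadi 2010?"

messages = [
    {"role": "user", "content": f"Konteks:\n{context}\n\nSoalan: {question}"},
]

inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=200, do_sample=False)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```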

Performance

On the MalayMMLU benchmark, the model achieved an average accuracy of 46.13%. Accuracy by category:

  • STEM: 46.33%
  • Language: 41.18%
  • Social Science: 46.86%
  • Others: 48.30%
  • Humanities: 49.89%

Good for

  • Applications requiring deep understanding and generation in Malaysian languages and dialects.
  • Chatbots or virtual assistants tailored for the Malaysian market.
  • Content generation and coding tasks with a focus on Malaysian linguistic nuances.
  • Educational tools or platforms needing to address Malaysian-specific topics and contexts.