mesolitica/malaysian-llama2-13b-32k-instructions

TEXT GENERATION · Concurrency Cost: 1 · Model Size: 13B · Quant: FP8 · Ctx Length: 4k · Architecture: Transformer

The mesolitica/malaysian-llama2-13b-32k-instructions model is a 13-billion-parameter, Llama2-based, instruction-tuned language model developed by mesolitica. It was fine-tuned with QLoRA on a Malaysian translation of the UltraChat dataset and is designed for chat completions in Malaysian. The model supports a 32k context length and follows the Llama2 chat template, making it suitable for conversational AI applications that require Malaysian language understanding and generation.

Overview

The mesolitica/malaysian-llama2-13b-32k-instructions model is a 13-billion-parameter instruction-tuned variant of the Llama2 architecture, developed by mesolitica. It was fine-tuned with QLoRA on mesolitica/google-translate-ultrachat, a Malaysian translation of the UltraChat dataset. The model is designed for chat completions and adheres to the exact Llama2 chat template for its conversational structure.
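
Because the model follows the standard Llama2 chat template (user turns wrapped in `[INST] ... [/INST]` markers), prompts can be built with the tokenizer's `apply_chat_template` helper. A minimal sketch, assuming the repository's tokenizer config ships the Llama2 chat template; the Malay prompt is just an illustrative example:

```python
from transformers import AutoTokenizer

# Load the tokenizer for the model (assumes the repo carries a Llama2 chat template).
tokenizer = AutoTokenizer.from_pretrained(
    "mesolitica/malaysian-llama2-13b-32k-instructions"
)

# Conversation in the role/content format expected by apply_chat_template.
messages = [
    {"role": "user", "content": "Apakah ibu negara Malaysia?"},  # example prompt in Malay
]

# Render the Llama2-style prompt string: <s>[INST] ... [/INST]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```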

Key Capabilities

  • Malaysian Language Proficiency: Optimized for understanding and generating responses in Malaysian, making it highly relevant for local applications.
  • Instruction Following: Fine-tuned to follow instructions effectively in a conversational context.
  • Extended Context Window: Supports a 32k context length, so long multi-turn dialogues can stay within context.
  • QLoRA Fine-tuning: Employs QLoRA for efficient fine-tuning, enabling deployment with 4-bit quantization using BitsAndBytesConfig (see the loading sketch after this list).
  • Flash Attention 2: Supports Flash Attention 2 for potentially faster inference.

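Putting the last two capabilities together, the model can be loaded with 4-bit quantization and Flash Attention 2 through the standard transformers API. A minimal sketch, assuming a CUDA GPU with the bitsandbytes and flash-attn packages installed; generation parameters are illustrative defaults, not values from the model card:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "mesolitica/malaysian-llama2-13b-32k-instructions"

# 4-bit NF4 quantization config, matching the QLoRA-style setup described above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    attn_implementation="flash_attention_2",  # optional; requires flash-attn
    device_map="auto",
)

# Example Malay prompt, rendered through the Llama2 chat template.
messages = [{"role": "user", "content": "Terangkan cuaca di Kuala Lumpur."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

If flash-attn is not available, dropping the `attn_implementation` argument falls back to the default attention implementation with the same results, just potentially slower.
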
Good For

  • Building chatbots and conversational AI agents that interact in Malaysian.
  • Applications requiring long-context understanding for Malaysian dialogues.
  • Research and development in Malaysian natural language processing.