mesolitica/malaysian-llama2-7b-32k-instructions
mesolitica/malaysian-llama2-7b-32k-instructions is a 7-billion-parameter Llama2-based causal language model developed by Mesolitica. It was fine-tuned with QLoRA on a Malaysian-translated version of the UltraChat dataset for chat completion in the Malaysian language. The model features an extended context length of 32,768 tokens and uses the Llama2 chat template, making it suitable for conversational AI applications that require understanding and generating Malaysian.
Overview
This model is an instruction-tuned variant of the 7B Llama2 architecture, fine-tuned by Mesolitica with QLoRA on a Malaysian-translated version of the UltraChat dataset. Its defining feature is an extended context window of 32,768 tokens, compared with the 4,096 tokens of base Llama2, which allows it to process longer conversations and more complex prompts.
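The sketch below shows a minimal way to load the model and generate a chat completion, assuming the standard Hugging Face transformers API and that the tokenizer carries the Llama2 chat template as described above. The Malay prompt is only an illustration; any message works the same way.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mesolitica/malaysian-llama2-7b-32k-instructions"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Format the conversation with the model's built-in Llama2 chat template
# instead of constructing [INST] ... [/INST] markers by hand.
messages = [{"role": "user", "content": "Apakah ibu negara Malaysia?"}]
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)

output = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```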
Key Capabilities
- Malaysian Language Proficiency: Optimized for understanding queries and generating responses in the Malaysian language.
- Chat Completions: Designed to follow the Llama2 chat template for effective conversational interactions.
- Extended Context: Supports a 32k token context length, enabling more coherent and context-aware responses over longer dialogues.
- Quantization: Can be loaded with 4-bit NF4 quantization, double quantization, and a bfloat16 compute dtype for efficient deployment (see the loading sketch after this list).
- Flash Attention 2: Supports Flash Attention 2 for faster inference on compatible GPUs.
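The following sketch loads the model under the configuration listed above: 4-bit NF4 quantization with double quantization and a bfloat16 compute dtype, plus Flash Attention 2. It assumes the bitsandbytes and flash-attn packages are installed and a CUDA GPU is available; the exact deployment settings are illustrative, not the only supported ones.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "mesolitica/malaysian-llama2-7b-32k-instructions"

# 4-bit NF4 quantization with double quantization and bfloat16 compute,
# matching the capabilities listed above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    attn_implementation="flash_attention_2",  # requires the flash-attn package
    device_map="auto",
)
```

Note that `attn_implementation="flash_attention_2"` requires a recent transformers release (4.36 or later); on older versions the equivalent flag was `use_flash_attention_2=True`.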
Good For
- Developing chatbots and conversational agents for Malaysian-speaking users.
- Applications requiring long-context understanding and generation in Malaysian.
- Research and development in low-resource language NLP, specifically for Malaysian.
- Use cases where efficient deployment of a 7B parameter model with extended context is crucial.