Overview
This model, mesolitica/malaysian-llama2-7b-32k-instructions, is a 7-billion-parameter Llama 2-based instruction-tuned model developed by Mesolitica. It was fine-tuned with QLoRA on a Malaysian-translated version of the UltraChat dataset. A key feature is its extended context window of 32,768 tokens, which allows it to process longer conversations and more complex prompts.
Key Capabilities
- Malaysian Language Proficiency: Optimized for generating responses and understanding queries in the Malaysian language.
- Chat Completions: Follows the Llama 2 chat template for effective conversational interactions.
- Extended Context: Supports a 32k token context length, enabling more coherent and context-aware responses over longer dialogues.
- Quantization: Utilizes 4-bit quantization (NF4) with double quantization and bfloat16 compute dtype for efficient deployment.
- Flash Attention 2: Incorporates Flash Attention 2 for potentially faster inference.
Good For
- Developing chatbots and conversational agents for Malaysian-speaking users.
- Applications requiring long-context understanding and generation in Malaysian.
- Research and development in low-resource language NLP, specifically for Malaysian.
- Use cases where efficient deployment of a 7B parameter model with extended context is crucial.
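For the chatbot use cases above, prompts should follow the Llama 2 chat template the model was tuned on. A minimal single-turn formatting sketch (the helper name is hypothetical; the `<s>[INST] ... [/INST]` and `<<SYS>>` markers are the standard Llama 2 template):

```python
def format_llama2_prompt(user_message: str, system_prompt: str = "") -> str:
    """Format a single-turn prompt in the Llama 2 chat template."""
    if system_prompt:
        # System prompt is wrapped in <<SYS>> markers inside the first [INST] block.
        return (
            f"<s>[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n"
            f"{user_message} [/INST]"
        )
    return f"<s>[INST] {user_message} [/INST]"

prompt = format_llama2_prompt("apa khabar?")
# → "<s>[INST] apa khabar? [/INST]"
```

The model's response is generated after the closing `[/INST]`; for multi-turn chat, prior turns are appended as further `[INST] ... [/INST]` pairs within the 32k-token window.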