asyafiqe/Merak-7B-v3-Mini-Orca-Indo
Merak-7B-v3-Mini-Orca-Indo is a 7-billion-parameter language model developed by asyafiqe, fine-tuned from Ichsan2895's Merak-7B-v3. It is instruction-tuned on a Bahasa Indonesia translation of the Orca Mini v1 dataset, optimizing it for detailed, long-form responses in Indonesian. The model supports a context length of 4096 tokens and is designed for conversational AI tasks requiring comprehensive answers in the Indonesian language.
Overview
Merak-7B-v3-Mini-Orca-Indo was fine-tuned by asyafiqe from the base model Ichsan2895/Merak-7B-v3. Its primary distinction is instruction-tuning on a Bahasa Indonesia translation of the psmathur/orca_mini_v1_dataset, which specializes it for generating detailed, lengthy responses in Indonesian.
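To make usage concrete, below is a minimal loading-and-generation sketch with Hugging Face transformers. The SYSTEM/USER/ASSISTANT prompt layout is an assumption based on the model's Orca-style tuning, and the sampling parameters are illustrative; consult the model card for the exact template.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "asyafiqe/Merak-7B-v3-Mini-Orca-Indo"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision fits the ~16GB VRAM figure
    device_map="auto",
)

# Orca-style prompt layout (an assumption; check the model card's template).
prompt = (
    "SYSTEM: Anda adalah asisten yang membantu dan selalu menjawab dengan rinci.\n"
    "USER: Jelaskan apa itu fotosintesis.\n"
    "ASSISTANT:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```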
Key Capabilities
- Indonesian Language Specialization: Optimized for understanding and generating comprehensive text in Bahasa Indonesia.
- Instruction Following: Fine-tuned on an Orca-style dataset to excel at following instructions and producing detailed answers.
- Context Length: Supports a context window of 4096 tokens, allowing it to process and generate longer texts.
- Accessibility: Runs on GPUs with 16GB of VRAM, and with bitsandbytes quantization it can operate in as little as 6GB of VRAM, making it accessible across a range of hardware setups (see the quantized-loading sketch below).
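The 6GB figure corresponds to quantized loading. Here is a minimal sketch of loading the model in 4-bit with bitsandbytes through transformers; the specific quantization settings are assumptions, not values from the model card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "asyafiqe/Merak-7B-v3-Mini-Orca-Indo"

# Illustrative 4-bit settings; the model card may recommend different values.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_quant_type="nf4",
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # places layers automatically if VRAM is tight
)
```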
Training Details
The model was instruction fine-tuned for 6 hours using LoRA, DeepSpeed ZeRO-2, and FlashAttention, implemented via Axolotl. Key hyperparameters included a learning rate of 0.0004, a batch size of 16, and a cutoff (maximum sequence) length of 4096 tokens. The training process focused on enhancing the model's conversational and instruction-following abilities in Indonesian; a rough sketch of how these hyperparameters map onto common fine-tuning tooling follows.
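The original run used Axolotl, whose config is not reproduced here. As a sketch only, the stated hyperparameters might map onto a peft/transformers setup like the following; the LoRA rank, alpha, dropout, and target modules are assumptions, while the learning rate, batch size, and sequence length come from the details above.

```python
from peft import LoraConfig
from transformers import TrainingArguments

# Rank, alpha, dropout, and target modules are assumptions; only the
# learning rate, batch size, and sequence length are stated in the card.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="merak-orca-indo",
    learning_rate=4e-4,              # stated learning rate (0.0004)
    per_device_train_batch_size=16,  # stated batch size
    fp16=True,
)

# The 4096-token cutoff would be applied at tokenization time,
# e.g. tokenizer(..., truncation=True, max_length=4096).
```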
Good For
- Applications requiring detailed and extensive text generation in Bahasa Indonesia.
- Conversational AI and chatbots designed for Indonesian-speaking users.
- Tasks that benefit from strong instruction-following capabilities in an Indonesian context.