Ichsan2895/Merak-7B-v1
Ichsan2895/Merak-7B-v1 is a 7-billion-parameter large language model developed by Ichsan2895, fine-tuned from Meta's Llama-2-7B-Chat-HF. The model is optimized for Indonesian, having been fine-tuned on cleaned Indonesian Wikipedia articles. It supports a 4096-token context length and, thanks to QLoRA fine-tuning, can run on 16 GB of VRAM, making it well suited to Indonesian language processing tasks.
Merak-7B: An Indonesian Language LLM
Merak-7B is a 7 billion parameter Large Language Model developed by Ichsan2895, specifically designed for the Indonesian language. It is built upon the Meta Llama-2-7B-Chat-HF architecture and has been fine-tuned using cleaned Indonesian Wikipedia articles to enhance its proficiency in Bahasa Indonesia.
Key Capabilities & Features
- Indonesian Language Specialization: Optimized for understanding and generating text in the Indonesian language.
- Efficient Deployment: Fine-tuned with QLoRA (Quantized Low-Rank Adaptation), enabling the model to run with as little as 16 GB of VRAM.
- Llama-2 Base: Benefits from the robust foundation of the Llama-2-7B-Chat-HF model.
- Context Length: Supports a context window of 4096 tokens.
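The VRAM figures above can be sanity-checked with a back-of-the-envelope estimate of weight memory alone (a rough sketch; real usage also needs room for activations, the KV cache, and framework overhead, so actual requirements are higher):

```python
def weight_memory_gib(n_params: float, bits_per_param: float) -> float:
    """Estimate the memory needed just to hold the model weights, in GiB."""
    return n_params * bits_per_param / 8 / 1024**3

# 7B parameters at fp16 (16 bits each): ~13 GiB for the weights alone,
# which is why a 16 GB card is a comfortable fit.
print(f"{weight_memory_gib(7e9, 16):.1f} GiB")  # ~13.0

# The same weights quantized to 4 bits: ~3.3 GiB, leaving headroom for
# activations and the KV cache on a ~10 GB card.
print(f"{weight_memory_gib(7e9, 4):.1f} GiB")   # ~3.3
```

This is only the static weight footprint; long 4096-token contexts grow the KV cache and push real usage well above these numbers.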
Usage Considerations
While Merak-7B can run with 4-bit quantization via BitsAndBytes to lower the VRAM requirement (to at least 10 GB), the developer notes that disabling 4-bit quantization, although it requires more VRAM, generally yields better answer quality. The model is licensed under Creative Commons Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0).
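The trade-off above can be sketched as a standard Hugging Face `transformers` load with an optional 4-bit BitsAndBytes configuration. This is a minimal sketch, not the developer's official inference script: it assumes `transformers`, `bitsandbytes`, and a CUDA GPU are available, and the sample prompt is illustrative.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_ID = "Ichsan2895/Merak-7B-v1"

# True: 4-bit quantization, >= 10 GB VRAM, somewhat lower answer quality.
# False: full fp16 weights, more VRAM, better answers per the model card.
USE_4BIT = True

quant_config = (
    BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.float16,
    )
    if USE_4BIT
    else None
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=quant_config,
    torch_dtype=torch.float16,
    device_map="auto",
)

# The model supports a 4096-token context window.
inputs = tokenizer("Apa ibu kota Indonesia?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Passing `quantization_config=None` simply loads the unquantized fp16 weights, so the same script covers both modes.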