# Merak-7B-v3: Indonesian Language LLM
Merak-7B-v3 is a 7-billion-parameter Large Language Model developed by Ichsan2895 and fine-tuned specifically for the Indonesian language. It is built on Meta's Llama-2-7B-Chat-HF.
## Key Capabilities & Features
- Indonesian Language Focus: Trained on a large corpus of Indonesian Wikipedia articles (200k in v1, 600k in v2) and further fine-tuned on the Ichsan2895/OASST_Top1_Indonesian and Ichsan2895/alpaca-gpt4-indonesian datasets.
- Efficient Deployment: Uses QLoRA (Quantized Low-Rank Adaptation) for efficient fine-tuning; the model runs on GPUs with 16 GB VRAM, and a 4-bit quantized variant lowers the requirement to at least 10 GB VRAM.
- Open Licensing: Distributed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) license, promoting accessibility for AI enthusiasts and researchers.
## Use Cases
Merak-7B-v3 is particularly well-suited for applications requiring strong performance in the Indonesian language, such as:
- Natural Language Understanding (NLU) in Indonesian.
- Text Generation in Indonesian.
- Chatbot development for Indonesian-speaking users.
- Research and development in Indonesian NLP.
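For chatbot use, inputs should be wrapped in the instruction format the model was trained with. The `User:`/`Assistant:` template below is a hypothetical placeholder for illustration only; consult the model card for the exact format:

```python
# Hypothetical single-turn prompt builder. The "User:"/"Assistant:"
# template is an illustrative assumption -- check the Merak-7B-v3
# model card for the exact instruction format it expects.
def build_prompt(
    user_message: str,
    system_message: str = "Anda adalah asisten yang membantu.",  # "You are a helpful assistant."
) -> str:
    return f"{system_message}\nUser: {user_message}\nAssistant: "

prompt = build_prompt("Apa ibu kota Indonesia?")  # "What is the capital of Indonesia?"
```

The resulting string is passed to the tokenizer, and the model's completion after `Assistant:` is the chatbot reply.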