Ichsan2895/Merak-7B-v5-PROTOTYPE1
Text Generation · Concurrency cost: 1 · Model size: 7B · Quant: FP8 · Context length: 4k · Published: Dec 10, 2023 · License: cc-by-nc-sa-4.0 · Architecture: Transformer · Open weights
Merak-7B-v5-PROTOTYPE1 by Ichsan2895 is a 7-billion-parameter large language model fine-tuned specifically for the Indonesian language. Built on the Mistral-7B-OpenOrca base model, this prototype uses QLoRA for efficient fine-tuning and TRL's DPOTrainer for RLHF-style preference alignment, allowing it to be trained and run on as little as 16 GB of VRAM. Its primary strength is processing and generating content in Bahasa Indonesia, making it well suited to Indonesian-centric NLP applications.
Merak-7B-v5-PROTOTYPE1: Indonesian Language LLM
Merak-7B-v5-PROTOTYPE1 is the first prototype of Merak-7B-v5, a 7-billion-parameter large language model developed by Ichsan2895 and optimized specifically for the Indonesian language. It is built on the Mistral-7B-OpenOrca base model.
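The model loads through the standard Hugging Face transformers path. Below is a minimal inference sketch; the 4-bit NF4 quantization config mirrors the QLoRA-style setup that keeps the model within a 16 GB VRAM budget, and the plain-text prompt is an assumption, so check the model card for the exact chat template.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "Ichsan2895/Merak-7B-v5-PROTOTYPE1"

# 4-bit NF4 quantization keeps the 7B model within a 16 GB VRAM budget.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# Hypothetical plain-text prompt; the model's actual chat template may differ.
prompt = "Jelaskan sejarah singkat kota Jakarta."  # "Briefly explain the history of Jakarta."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```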
Key Capabilities & Features
- Indonesian Language Focus: Fine-tuned on cleaned Indonesian Wikipedia articles, giving it strong proficiency in Bahasa Indonesia.
- Efficient Fine-tuning: Uses QLoRA (Quantized LoRA), allowing the model to be fine-tuned on as little as 16 GB of VRAM.
- Reinforcement Learning: Incorporates the TRL library's DPOTrainer for Reinforcement Learning from Human Feedback (RLHF)-style preference alignment (a combined QLoRA + DPO sketch follows this list).
- Open-Source Foundation: Based on the Mistral-7B-OpenOrca model, which is itself Mistral-7B instruction-tuned on a filtered GPT-4 subset of the OpenOrca dataset.
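To make the training recipe above concrete, here is a minimal sketch of a QLoRA + DPOTrainer pipeline of the kind described. This is not the author's actual training script: the preference dataset name and the LoRA/DPO hyperparameters are illustrative assumptions, and TRL's DPOTrainer interface has shifted across releases, so match the arguments to your installed version.

```python
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from trl import DPOConfig, DPOTrainer

base_id = "Open-Orca/Mistral-7B-OpenOrca"  # base model named in the card

# QLoRA step 1: load the frozen base weights in 4-bit NF4...
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# QLoRA step 2: train only low-rank adapters on top (hyperparameters are illustrative).
lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    bias="none", task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# DPO expects preference pairs with "prompt", "chosen", and "rejected" columns.
# "my-id-preference-pairs" is a hypothetical placeholder dataset.
dataset = load_dataset("my-id-preference-pairs", split="train")

trainer = DPOTrainer(
    model=model,
    args=DPOConfig(output_dir="merak-dpo", beta=0.1, per_device_train_batch_size=1),
    train_dataset=dataset,
    tokenizer=tokenizer,  # newer TRL releases name this argument `processing_class`
)
trainer.train()
```

With a PEFT-wrapped model, DPOTrainer can derive the reference policy by disabling the adapters, so no separate reference model needs to be held in memory.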
Good For
- Applications requiring strong performance in the Indonesian language.
- Researchers and developers working on Indonesian NLP tasks.
- Projects with limited VRAM: with QLoRA, the model can be fine-tuned and run on as little as 16 GB of VRAM.