dwikitheduck/gemma-2-2b-id-inst
dwikitheduck/gemma-2-2b-id-inst is a 2.6-billion-parameter instruction-tuned causal language model based on the Gemma-2 architecture. It is fine-tuned on a 9-million-token Indonesian Alpaca dataset, optimizing it for understanding and generating Indonesian text. It features an 8192-token context length, suitable for processing longer Indonesian texts and conversations.
dwikitheduck/gemma-2-2b-id-inst: Indonesian Instruction-Tuned Gemma-2
This model is an instruction-tuned variant of the Gemma-2 architecture, specifically adapted for the Indonesian language. With 2.6 billion parameters and an 8192-token context length, it is designed to handle a variety of natural language processing tasks in Indonesian.
Key Capabilities
- Indonesian Language Proficiency: Fine-tuned on a substantial 9 million token Indonesian Alpaca dataset, enhancing its understanding and generation capabilities for Indonesian text.
- Instruction Following: Optimized to follow instructions effectively, making it suitable for conversational AI, question answering, and prompt-based text generation (see the usage sketch after this list).
- Extended Context Window: Supports an 8192-token context length, allowing for more comprehensive processing of longer documents or multi-turn dialogues in Indonesian.
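The following is a minimal loading and generation sketch using the Transformers library. It assumes the tokenizer ships a Gemma-2-style chat template and that bfloat16 inference is available; adjust the dtype and device settings for your hardware.

```python
# Minimal usage sketch (assumes transformers, torch, and accelerate are installed,
# and that the tokenizer provides a Gemma-2-style chat template).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "dwikitheduck/gemma-2-2b-id-inst"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 inference; use float32 on CPUs without bf16
    device_map="auto",
)

# An Indonesian instruction-style prompt ("Briefly explain what photosynthesis is.").
messages = [
    {"role": "user", "content": "Jelaskan secara singkat apa itu fotosintesis."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```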
Training Details
The model underwent a single epoch of training with a learning rate of 5e-5, a per-device train batch size of 1, and 8 gradient accumulation steps. The training process focused on adapting the Gemma-2 base model to Indonesian instruction-following patterns.
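For reference, the stated hyperparameters would map onto a Transformers training configuration roughly as sketched below. The output directory, precision setting, and logging cadence are illustrative assumptions, not the author's exact training script.

```python
# Illustrative configuration reflecting the hyperparameters stated above
# (1 epoch, learning rate 5e-5, per-device batch size 1, gradient accumulation 8).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="gemma-2-2b-id-inst",   # assumption: placeholder output path
    num_train_epochs=1,
    learning_rate=5e-5,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,     # effective batch size of 8 per device
    bf16=True,                         # assumption: mixed-precision training
    logging_steps=10,                  # assumption: logging cadence not stated
)
```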
Good For
- Indonesian Chatbots and Virtual Assistants: Its instruction-following ability and Indonesian-language focus make it well suited for building interactive agents.
- Indonesian Content Generation: Generating articles, summaries, or creative text in Indonesian.
- Research and Development in Indonesian NLP: A solid base model for further fine-tuning or experimentation with Indonesian language tasks.