dwikitheduck/gemma-2-2b-id-inst

Text Generation · Concurrency Cost: 1 · Model Size: 2.6B · Quant: BF16 · Ctx Length: 8k · Published: Oct 24, 2024 · License: gemma · Architecture: Transformer

dwikitheduck/gemma-2-2b-id-inst is a 2.6-billion-parameter instruction-tuned causal language model based on the Gemma-2 architecture. It is fine-tuned on a 9-million-token Indonesian Alpaca dataset, optimizing it for understanding and generating Indonesian text. Its 8192-token context length suits longer Indonesian documents and multi-turn conversations.


dwikitheduck/gemma-2-2b-id-inst: Indonesian Instruction-Tuned Gemma-2

This model is an instruction-tuned variant of the Gemma-2 architecture, adapted specifically for the Indonesian language. With 2.6 billion parameters and an 8192-token context length, it is designed to handle a range of natural language processing tasks in Indonesian.

Key Capabilities

  • Indonesian Language Proficiency: Fine-tuned on a 9-million-token Indonesian Alpaca dataset, improving its understanding and generation of Indonesian text.
  • Instruction Following: Optimized to follow instructions effectively, making it suitable for conversational AI, question answering, and text generation based on prompts.
  • Extended Context Window: Supports an 8192-token context length, allowing for more comprehensive processing of longer documents or multi-turn dialogues in Indonesian.
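Gemma-2 chat models wrap each turn in `<start_of_turn>`/`<end_of_turn>` markers, and this fine-tune presumably keeps the base template (an assumption; in practice `tokenizer.apply_chat_template` handles this automatically). A minimal sketch of the prompt format, with a hypothetical helper name:

```python
def format_gemma_chat(user_message: str) -> str:
    """Wrap a user message in Gemma-2's turn markup (assumed unchanged
    by this Indonesian fine-tune). The trailing model turn opener cues
    the model to begin its reply."""
    return (
        "<start_of_turn>user\n"
        f"{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

# Example Indonesian instruction: "Explain what machine learning is."
prompt = format_gemma_chat("Jelaskan apa itu pembelajaran mesin.")
```

In a real deployment you would pass this prompt to the tokenizer and model rather than building the string by hand, but the explicit format makes the turn structure visible.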

Training Details

The model was trained for a single epoch with a learning rate of 5e-5, a per-device train batch size of 1, and gradient accumulation over 8 steps (an effective batch size of 8). Training focused on adapting the Gemma-2 base model to Indonesian instruction-following patterns.
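The hyperparameters above imply a small effective batch and a short training run. A rough back-of-the-envelope sketch, assuming the 9M training tokens are packed into full 8192-token sequences (an assumption; actual sequence packing is not documented):

```python
import math

per_device_batch_size = 1
grad_accum_steps = 8

# Effective batch size per optimizer step
effective_batch = per_device_batch_size * grad_accum_steps

# Assumed: 9M tokens packed into full 8192-token sequences
total_tokens = 9_000_000
ctx_length = 8192
sequences = total_tokens // ctx_length

# Optimizer steps in one epoch under these assumptions
optimizer_steps = math.ceil(sequences / effective_batch)
```

Under these assumptions the run amounts to only a few hundred optimizer steps, which is consistent with a light single-epoch adaptation of the base model rather than extensive retraining.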

Good For

  • Indonesian Chatbots and Virtual Assistants: Its instruction-following and Indonesian language focus make it ideal for building interactive agents.
  • Indonesian Content Generation: Generating articles, summaries, or creative text in Indonesian.
  • Research and Development in Indonesian NLP: A solid base model for further fine-tuning or experimentation with Indonesian language tasks.