ihalage/llama3-sinhala: Sinhala Language Instruction-Tuned LLaMA3
The ihalage/llama3-sinhala model is an 8-billion-parameter instruction-tuned language model based on Meta's LLaMA3 architecture. Its primary focus is enhanced understanding and generation of the Sinhala language.
Key Capabilities & Features
- Sinhala Language Proficiency: Specifically fine-tuned to process and generate text in Sinhala, offering improved performance compared to the base LLaMA3 model for this language.
- Instruction Following: Designed to follow instructions effectively in Sinhala, making it suitable for conversational AI and task-oriented applications.
- Training Data: Fine-tuned on a substantial Sinhala dataset compiled by translating English datasets such as ELI5 and Alpaca, available on Hugging Face Datasets (sinhala-instruction-finetune-large).
- Quantization & LoRA: The base model was 4-bit quantized and fine-tuned using LoRA adapters (rank 16, scaling factor 32) with a causal language modeling objective.
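Because the fine-tuning data was translated from Alpaca, an Alpaca-style prompt is a reasonable starting point for inference. Below is a minimal sketch, assuming the standard Alpaca template and the usual transformers loading pattern; the exact prompt format the model was trained on is not documented here, so treat the template as an assumption and verify it against the model card:

```python
def build_prompt(instruction: str, input_text: str = "") -> str:
    """Alpaca-style instruction template (assumed; check the model card
    for the exact format used during fine-tuning)."""
    if input_text:
        return (
            "Below is an instruction that describes a task, paired with an "
            "input that provides further context. Write a response that "
            "appropriately completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{input_text}\n\n"
            "### Response:\n"
        )
    return (
        "Below is an instruction that describes a task. Write a response "
        "that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )


# "Write a short description of Sri Lanka." (in Sinhala)
prompt = build_prompt("ශ්‍රී ලංකාව ගැන කෙටි විස්තරයක් ලියන්න.")

# Inference via the standard transformers API (commented out because it
# downloads the full model weights; usage assumes the model is published
# as a causal LM on the Hugging Face Hub):
# from transformers import AutoModelForCausalLM, AutoTokenizer
# tokenizer = AutoTokenizer.from_pretrained("ihalage/llama3-sinhala")
# model = AutoModelForCausalLM.from_pretrained(
#     "ihalage/llama3-sinhala", device_map="auto"
# )
# inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# output = model.generate(**inputs, max_new_tokens=256)
# print(tokenizer.decode(output[0], skip_special_tokens=True))
```

For 4-bit inference matching the training setup, the same `from_pretrained` call can be given a quantization config (e.g. transformers' `BitsAndBytesConfig(load_in_4bit=True)`), which requires the bitsandbytes package.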
Why Choose This Model?
This model is particularly valuable for developers and researchers working on applications requiring robust Sinhala language processing. It addresses the need for high-quality, instruction-following capabilities in Sinhala, where general-purpose LLMs might fall short. Its specialized fine-tuning makes it a strong candidate for chatbots, content generation, and other NLP tasks targeting the Sinhala-speaking audience.