ihalage/llama3-sinhala: Sinhala Language Instruction-Tuned LLaMA3
The ihalage/llama3-sinhala model is an 8 billion parameter instruction-tuned language model based on Meta's LLaMA3 architecture. Its primary focus is to provide enhanced understanding and generation capabilities for the Sinhala language.
Key Capabilities & Features
- Sinhala Language Proficiency: Specifically fine-tuned to process and generate text in Sinhala, offering improved performance compared to the base LLaMA3 model for this language.
- Instruction Following: Designed to follow instructions effectively in Sinhala, making it suitable for conversational AI and task-oriented applications.
- Training Data: Fine-tuned on a substantial Sinhala dataset (`sinhala-instruction-finetune-large`, available on Hugging Face Datasets), compiled by translating English datasets such as ELI5 and Alpaca.
- Quantization & LoRA: The original model was 4-bit quantized and fine-tuned using LoRA adapters (rank 16, scaling factor 32) with a causal language modeling objective.
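A minimal sketch of loading the model and generating a Sinhala response with the Hugging Face `transformers` library. The Alpaca-style prompt template in `build_prompt` is an assumption (the fine-tuning data is Alpaca-derived); check the model card for the exact format the model expects.

```python
MODEL_ID = "ihalage/llama3-sinhala"


def build_prompt(instruction: str, user_input: str = "") -> str:
    """Assemble an Alpaca-style instruction prompt (assumed format,
    not confirmed by the model card)."""
    if user_input:
        return (
            "### Instruction:\n" + instruction + "\n\n"
            "### Input:\n" + user_input + "\n\n"
            "### Response:\n"
        )
    return "### Instruction:\n" + instruction + "\n\n### Response:\n"


def generate(instruction: str, max_new_tokens: int = 256) -> str:
    """Download the model (~16 GB in fp16) and generate one response."""
    # Imported lazily so the prompt helper works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    inputs = tokenizer(build_prompt(instruction), return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the echoed prompt.
    return tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )


if __name__ == "__main__":
    # "What is the capital of Sri Lanka?" in Sinhala
    print(generate("ශ්‍රී ලංකාවේ අගනුවර කුමක්ද?"))
```

Since the published checkpoint was trained 4-bit quantized with LoRA, passing a `BitsAndBytesConfig(load_in_4bit=True)` as `quantization_config` to `from_pretrained` is a reasonable way to keep memory use close to the training setup on a single GPU.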
Why Choose This Model?
This model is particularly valuable for developers and researchers working on applications requiring robust Sinhala language processing. It addresses the need for high-quality, instruction-following capabilities in Sinhala, where general-purpose LLMs might fall short. Its specialized fine-tuning makes it a strong candidate for chatbots, content generation, and other NLP tasks targeting the Sinhala-speaking audience.