Thimira/sinhala-llama-2-7b-chat-hf
Thimira/sinhala-llama-2-7b-chat-hf is a 7-billion-parameter, Llama 2-based causal language model fine-tuned by Thimira for Sinhala text generation. Built on the NousResearch/Llama-2-7b-chat-hf base model, it targets assistant-like chat in Sinhala. Its current capabilities are limited, and the model requires further data and fine-tuning.
Overview
Thimira/sinhala-llama-2-7b-chat-hf is a 7 billion parameter language model, fine-tuned from the NousResearch/Llama-2-7b-chat-hf base model. Its primary purpose is Sinhala language text generation, specifically for assistant-like chat interactions. The model was trained on the Thimira/sinhala-llm-dataset-llama-prompt-format dataset.
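The model can be loaded with the Hugging Face Transformers library like any other Llama 2-style causal language model. The snippet below is a minimal sketch; the dtype, device placement, generation settings, and the Sinhala example prompt are illustrative assumptions rather than documented usage.

```python
# Minimal sketch: load the model and generate Sinhala text.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Thimira/sinhala-llama-2-7b-chat-hf"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # assumes a GPU; use float32 on CPU
    device_map="auto",
)

# The tokenizer adds the BOS token itself, so the prompt starts with [INST].
prompt = "[INST] ශ්‍රී ලංකාව ගැන කෙටි විස්තරයක් ලියන්න [/INST]"  # "Write a short description of Sri Lanka."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```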
Key Capabilities
- Sinhala Language Generation: Designed to produce text in the Sinhala language.
- Assistant-like Chat: Intended for conversational applications in Sinhala.
- Llama 2 Prompt Format Adherence: Requires the LLaMA 2 prompt format (including [INST], [/INST], and <<SYS>> tags, BOS/EOS tokens, and specific whitespace) for optimal performance; see the sketch after this list.
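For reference, the standard LLaMA 2 chat layout looks like the following. This is a sketch of the generic format; the system and user messages shown are placeholders, not prompts from the training data.

```python
# Sketch of the standard LLaMA 2 chat prompt layout this model expects.
# If the tokenizer adds the BOS token (<s>) automatically, it is omitted here.
def build_llama2_prompt(system_message: str, user_message: str) -> str:
    return (
        "[INST] <<SYS>>\n"
        f"{system_message}\n"
        "<</SYS>>\n\n"
        f"{user_message} [/INST]"
    )

prompt = build_llama2_prompt(
    "You are a helpful assistant that answers in Sinhala.",  # placeholder system message
    "කෘත්‍රිම බුද්ධිය යනු කුමක්ද?",  # placeholder user message: "What is artificial intelligence?"
)
print(prompt)
```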
Limitations and Training Details
The model's capabilities are currently described as extremely limited, and additional data and fine-tuning are needed to make it more useful. It was trained for 2 epochs with a learning rate of 0.0002 using an Adam optimizer. The training used PEFT 0.10.0, Transformers 4.40.2, PyTorch 2.1.0, Datasets 2.19.1, and Tokenizers 0.19.1; a comparable setup is sketched below.
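For context, a PEFT-based fine-tuning run with the reported hyperparameters (2 epochs, learning rate 2e-4, Adam-family optimizer) might look like the following sketch. The LoRA rank, alpha, target modules, batch size, sequence length, and the assumption that the dataset exposes a "text" column are all illustrative assumptions, not the actual training configuration.

```python
# Sketch of a comparable PEFT (LoRA) fine-tuning setup on the same base model and dataset.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_model_id = "NousResearch/Llama-2-7b-chat-hf"
dataset = load_dataset("Thimira/sinhala-llm-dataset-llama-prompt-format", split="train")

tokenizer = AutoTokenizer.from_pretrained(base_model_id)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base_model_id, device_map="auto")

# Illustrative LoRA settings; the values used for this model are not documented here.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

def tokenize(batch):
    # Assumes the dataset stores prompts in a "text" column.
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="sinhala-llama-2-7b-chat-hf",
        num_train_epochs=2,          # reported
        learning_rate=2e-4,          # reported
        per_device_train_batch_size=4,  # illustrative
        fp16=True,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```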