Thimira/sinhala-llama-2-7b-chat-hf

Text Generation
  • Concurrency cost: 1
  • Model size: 7B
  • Quantization: FP8
  • Context length: 4k
  • Published: Apr 1, 2024
  • License: llama2
  • Architecture: Transformer
  • Open weights

Thimira/sinhala-llama-2-7b-chat-hf is a 7 billion parameter Llama 2-based causal language model fine-tuned by Thimira for Sinhala language text generation. This model specializes in assistant-like chat in Sinhala, building upon the NousResearch/Llama-2-7b-chat-hf base model. It is specifically designed for Sinhala language applications, though its current capabilities are limited and require further data and fine-tuning.


Overview

Thimira/sinhala-llama-2-7b-chat-hf is a 7 billion parameter language model, fine-tuned from the NousResearch/Llama-2-7b-chat-hf base model. Its primary purpose is Sinhala language text generation, specifically for assistant-like chat interactions. The model was trained on the Thimira/sinhala-llm-dataset-llama-prompt-format dataset.

Key Capabilities

  • Sinhala Language Generation: Designed to produce text in the Sinhala language.
  • Assistant-like Chat: Intended for conversational applications in Sinhala.
  • Llama 2 Prompt Format Adherence: Requires the LLaMA 2 prompt format (including [INST], [/INST], <<SYS>> tags, BOS/EOS tokens, and specific whitespace) for optimal performance.
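The prompt format described above can be sketched as a small helper. This is a minimal illustration of the standard Llama 2 chat template, not code taken from the model repository; the helper name and messages are hypothetical.

```python
# Minimal sketch of the Llama 2 chat prompt format this model expects.
# Note: when using the Hugging Face tokenizer, the BOS token (<s>) is
# normally added during tokenization rather than written by hand.

def build_llama2_prompt(system: str, user: str) -> str:
    """Wrap a system message and a user message in Llama 2 chat tags."""
    return f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"

prompt = build_llama2_prompt(
    "You are a helpful assistant that replies in Sinhala.",
    "ආයුබෝවන්!",
)
print(prompt)
```

The model's reply is expected to follow `[/INST]` and terminate with the EOS token; subsequent turns are appended in the same `[INST] … [/INST]` pattern.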

Limitations and Training Details

The author describes the model's current capabilities as extremely limited; additional data and fine-tuning are needed to improve its utility. It was trained for 2 epochs with a learning rate of 0.0002, using the Adam optimizer. Training used PEFT 0.10.0, Transformers 4.40.2, PyTorch 2.1.0, Datasets 2.19.1, and Tokenizers 0.19.1.
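The reported hyperparameters can be collected into a single configuration, roughly as one might pass them to a PEFT/Transformers training script. The key names below are illustrative, not taken from the original training code.

```python
# Hedged sketch: the training setup reported above, as a config dict.
# Key names are hypothetical; only the values come from the model card.
training_config = {
    "base_model": "NousResearch/Llama-2-7b-chat-hf",
    "dataset": "Thimira/sinhala-llm-dataset-llama-prompt-format",
    "num_train_epochs": 2,
    "learning_rate": 2e-4,   # 0.0002
    "optimizer": "adam",
}
```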