Model Overview
werty1248/Llama-3-Ko-8B-OpenOrca is an 8-billion-parameter language model based on the Llama 3 architecture and fine-tuned for Korean. It uses beomi/Llama-3-Open-Ko-8B as its base model and was further trained on the kyujinpy/OpenOrca-KO dataset.
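The snippet below is a minimal inference sketch, assuming the model follows the standard Hugging Face transformers causal-LM interface; the Korean prompt and generation settings are illustrative, as the card does not document a prompt template.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "werty1248/Llama-3-Ko-8B-OpenOrca"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the bf16 training precision
    device_map="auto",
)

prompt = "한국의 수도는 어디인가요?"  # "What is the capital of Korea?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```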
Training Details
The model was trained for 4 epochs with 8-bit LoRA using Axolotl, at a sequence length of 4096 and bf16 precision. Training took approximately 6 hours on two A6000 GPUs.
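For a rough picture of that setup outside Axolotl, here is a hedged sketch using peft and bitsandbytes. Only the 8-bit quantization, bf16 precision, 4096 sequence length, base model, and dataset come from the card; the LoRA rank, alpha, and target modules are placeholder values, not the author's settings.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base = AutoModelForCausalLM.from_pretrained(
    "beomi/Llama-3-Open-Ko-8B",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),  # "LoRA-8bit"
    torch_dtype=torch.bfloat16,  # bf16 precision, per the card
    device_map="auto",
)
base = prepare_model_for_kbit_training(base)

lora_config = LoraConfig(
    r=16,           # placeholder: rank is not documented in the card
    lora_alpha=32,  # placeholder: alpha is not documented in the card
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # placeholder
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
# A training loop (4 epochs, inputs tokenized to max_length=4096) on the
# kyujinpy/OpenOrca-KO dataset would follow, e.g. via transformers.Trainer.
```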
Evaluation and Performance
The model was evaluated on the kobest benchmark for Korean language understanding. Key results include:
- 0-shot Performance:
  - kobest_boolq: 0.5021 accuracy
  - kobest_copa: 0.6920 accuracy
  - kobest_sentineg: 0.7330 accuracy
- 5-shot Performance:
  - kobest_boolq: 0.7123 accuracy
  - kobest_copa: 0.7620 accuracy
  - kobest_sentineg: 0.9446 accuracy
These results indicate solid proficiency across Korean natural language understanding tasks, with marked gains under few-shot prompting: kobest_sentineg accuracy rises from 0.7330 to 0.9446 between the 0-shot and 5-shot settings.
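A sketch of reproducing these numbers with EleutherAI's lm-evaluation-harness is shown below; the card does not state which evaluation tool or version was used, so treat this as an approximation rather than the author's exact procedure.

```python
import lm_eval

# The kobest_* task names are the ones shipped with lm-evaluation-harness.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=werty1248/Llama-3-Ko-8B-OpenOrca,dtype=bfloat16",
    tasks=["kobest_boolq", "kobest_copa", "kobest_sentineg"],
    num_fewshot=5,  # set to 0 for the 0-shot column
)
print(results["results"])
```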
Licensing
The model is released under the Llama 3 license, which can be reviewed at https://llama.meta.com/llama3/license.
Use Cases
This model is particularly well-suited for applications requiring strong Korean language processing, such as:
- Korean text generation
- Question answering in Korean
- Sentiment analysis for Korean text (see the sketch after this list)
- General Korean NLP tasks
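As an example of the sentiment-analysis use case, here is a minimal few-shot prompting sketch. The label format and example sentences are assumptions for illustration; the card does not prescribe a prompt format.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "werty1248/Llama-3-Ko-8B-OpenOrca"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Few-shot prompt: label Korean sentences as 긍정 (positive) / 부정 (negative).
prompt = (
    "문장: 배송이 빠르고 품질이 좋아요.\n감정: 긍정\n"    # "Fast delivery, good quality." -> positive
    "문장: 기대했는데 너무 실망했어요.\n감정: 부정\n"      # "I expected more; very disappointed." -> negative
    "문장: 가격 대비 정말 만족스러운 제품입니다.\n감정:"   # query sentence
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=5)
# Decode only the newly generated tokens (the predicted label).
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```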