werty1248/Llama-3-Ko-8B-OpenOrca

8B parameters · FP8 · 8192-token context · License: llama3

Model Overview

werty1248/Llama-3-Ko-8B-OpenOrca is an 8-billion-parameter language model based on the Llama 3 architecture and fine-tuned for Korean. It uses beomi/Llama-3-Open-Ko-8B as its base model and was further trained on the kyujinpy/OpenOrca-KO dataset.

Training Details

The model was trained for 4 epochs of 8-bit LoRA fine-tuning using Axolotl, with a sequence length of 4096 and bf16 precision. Training took approximately 6 hours on two NVIDIA A6000 GPUs.
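The settings above map directly onto an Axolotl configuration file. The following is a minimal sketch reconstructed from the stated details (base model, 8-bit LoRA, sequence length 4096, bf16, 4 epochs); values not stated in this card, such as the LoRA rank, alpha, target modules, and dataset prompt format, are placeholders and would need to be confirmed against the actual training config:

```yaml
base_model: beomi/Llama-3-Open-Ko-8B

# 8-bit LoRA fine-tuning, as described above
load_in_8bit: true
adapter: lora
lora_r: 16                # placeholder -- rank not stated in the card
lora_alpha: 32            # placeholder
lora_target_modules:      # placeholder
  - q_proj
  - k_proj
  - v_proj
  - o_proj

datasets:
  - path: kyujinpy/OpenOrca-KO
    type: alpaca          # placeholder -- actual prompt format not stated

sequence_len: 4096
bf16: true
num_epochs: 4
```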

Evaluation and Performance

Evaluations were conducted on the KoBEST benchmark, which measures Korean natural language understanding. Key results (accuracy):

  • 0-shot:
    • kobest_boolq: 0.5021
    • kobest_copa: 0.6920
    • kobest_sentineg: 0.7330
  • 5-shot:
    • kobest_boolq: 0.7123
    • kobest_copa: 0.7620
    • kobest_sentineg: 0.9446

These results indicate the model's proficiency across a range of Korean natural language understanding tasks, particularly with few-shot prompting.
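The kobest_* identifiers above are the task names used by EleutherAI's lm-evaluation-harness, so a comparable run can be attempted with that tool. A sketch of the 5-shot invocation is shown below (this assumes the harness is installed and a suitable GPU is available; exact flags may vary between harness versions, and the card does not state which version produced the scores above):

```shell
pip install lm-eval

lm_eval --model hf \
  --model_args pretrained=werty1248/Llama-3-Ko-8B-OpenOrca,dtype=bfloat16 \
  --tasks kobest_boolq,kobest_copa,kobest_sentineg \
  --num_fewshot 5 \
  --batch_size auto
```

Dropping `--num_fewshot 5` (or setting it to 0) corresponds to the 0-shot results.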

Licensing

The model is distributed under the Llama 3 license, which can be reviewed at https://llama.meta.com/llama3/license.

Use Cases

This model is particularly well-suited for applications requiring strong Korean language processing, such as:

  • Korean text generation
  • Question answering in Korean
  • Sentiment analysis for Korean text
  • General Korean NLP tasks
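For chat-style use cases such as the question answering above, the model generally needs its expected prompt template. This card does not document the template used during fine-tuning, so the authoritative source is the model's own tokenizer (tokenizer.apply_chat_template). Assuming it follows the standard Llama 3 instruct format, a minimal prompt builder might look like this (the system and user strings here are illustrative):

```python
# Hypothetical prompt builder using the standard Llama 3 chat format.
# NOTE: this is an assumption -- the actual template for this fine-tune
# should be read from the model's tokenizer via apply_chat_template.

def build_llama3_prompt(system: str, user: str) -> str:
    """Format a single-turn prompt in the Llama 3 instruct style."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n" + system + "<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n" + user + "<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_llama3_prompt(
    "You are a helpful assistant. Answer in Korean.",
    "Llama 3란 무엇인가요?",  # "What is Llama 3?"
)
print(prompt)
```

The resulting string can be passed to any text-generation backend serving the model; generation should stop at the <|eot_id|> token.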