kyujinpy/Kosy-Platypus2-13B

Text Generation · Concurrency Cost: 1 · Model Size: 13B · Quant: FP8 · Ctx Length: 4K · License: cc-by-nc-sa-4.0 · Architecture: Transformer · Open Weights · Warm

kyujinpy/Kosy-Platypus2-13B is a 13 billion parameter causal language model developed by Kyujin Han (kyujinpy) and fine-tuned using the NEFTune method. Built on the hyunseoki/ko-en-llama2-13b base model, it is optimized for Korean language tasks and posts competitive results on the KO-LLM leaderboard, making it well suited for applications that require robust Korean language understanding and generation.


Model Overview

kyujinpy/Kosy-Platypus2-13B, also known as Kosy🍵llama, is a 13 billion parameter language model developed by Kyujin Han (kyujinpy). It is built on the hyunseoki/ko-en-llama2-13b base model and fine-tuned with NEFTune (Noisy Embedding Fine-Tuning), a method that adds random noise to the token embeddings during training to improve fine-tuned performance.
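
Since the model follows the standard Llama 2 causal-LM layout, it can be loaded with the Hugging Face transformers API. A minimal sketch, assuming a CUDA-capable GPU with enough memory for the 13B weights; the dtype/device settings and the prompt are illustrative, and the model card should be checked for any preferred instruction template:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "kyujinpy/Kosy-Platypus2-13B"

# Load tokenizer and model; device_map="auto" (requires accelerate)
# places the weights on available GPUs automatically.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # keep the checkpoint's native precision
    device_map="auto",
)

# Illustrative Korean prompt: "Please briefly explain the NEFTune method."
prompt = "NEFTune 방법에 대해 간단히 설명해 주세요."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```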

Key Capabilities & Features

  • Korean Language Optimization: Specifically trained and evaluated for Korean language tasks, leveraging the kyujinpy/KOpen-platypus dataset.
  • NEFTune Method: Fine-tuned with the NEFTune technique, which is detailed in the associated KoNEFTune GitHub repository; a sketch of the core noise-injection step appears after this list.
  • Performance Benchmarking: Benchmarked on the KO-LLM leaderboard, showing competitive results across various Korean language metrics such as Ko-ARC, Ko-HellaSwag, Ko-MMLU, Ko-TruthfulQA, and Ko-CommonGen V2.
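
For orientation, here is a minimal sketch of the noise injection NEFTune performs on token embeddings during training, following the scaling rule from the NEFTune paper. The function name and default alpha are illustrative; consult the KoNEFTune repository for the exact implementation used for this model:

```python
import torch

def add_neftune_noise(embeddings: torch.Tensor, noise_alpha: float = 5.0) -> torch.Tensor:
    """Add NEFTune-style uniform noise to token embeddings (training only).

    embeddings: (batch, seq_len, hidden_dim) output of the embedding layer.
    The noise is Uniform(-1, 1) scaled by noise_alpha / sqrt(seq_len * hidden_dim),
    as described in the NEFTune paper.
    """
    _, seq_len, hidden_dim = embeddings.shape
    scale = noise_alpha / (seq_len * hidden_dim) ** 0.5
    noise = torch.empty_like(embeddings).uniform_(-1.0, 1.0)
    return embeddings + scale * noise
```

In practice the same effect is available off the shelf, for example via the neftune_noise_alpha argument of trl's SFTTrainer.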

Performance Highlights

This model corresponds to the NEFT(🍵kosy)+MLP-v3 variant, which achieved an average score of 46.31 on the KO-LLM leaderboard, outperforming the base Ko-Platypus2-13B. Its per-benchmark scores:

  • Ko-ARC: 43.34
  • Ko-HellaSwag: 54.54
  • Ko-MMLU: 43.38
  • Ko-TruthfulQA: 44.11
  • Ko-CommonGen V2: 46.16

Ideal Use Cases

This model is well-suited for developers and researchers focusing on:

  • Applications requiring high-quality Korean language generation and understanding.
  • Experimentation with NEFTune-based fine-tuning for improved model robustness and performance.
  • Tasks that benefit from a strong performance baseline on established Korean language benchmarks.

Popular Sampler Settings

Featherless users most often run this model with sampler configurations built from the following parameters: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, and min_p. The three most popular value combinations are shown in the interactive tabs on the model page.
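
These parameters map directly onto an OpenAI-compatible completion call. A minimal sketch, assuming Featherless exposes an OpenAI-compatible endpoint at the base URL below; the API key and every sampler value here are illustrative placeholders, not the actual top user configurations:

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.featherless.ai/v1",  # assumed OpenAI-compatible endpoint
    api_key="YOUR_FEATHERLESS_API_KEY",        # placeholder
)

response = client.chat.completions.create(
    model="kyujinpy/Kosy-Platypus2-13B",
    # Illustrative Korean prompt: "Tell me about the capital of Korea."
    messages=[{"role": "user", "content": "한국의 수도에 대해 알려 주세요."}],
    # Standard OpenAI sampler parameters (values are illustrative):
    temperature=0.7,
    top_p=0.9,
    frequency_penalty=0.0,
    presence_penalty=0.0,
    # Samplers outside the OpenAI spec go through extra_body (also illustrative):
    extra_body={"top_k": 40, "repetition_penalty": 1.1, "min_p": 0.05},
)
print(response.choices[0].message.content)
```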