kyujinpy/Kosy-Platypus2-13B
kyujinpy/Kosy-Platypus2-13B is a 13 billion parameter causal language model developed by Kyujin Han (kyujinpy), fine-tuned using the NEFTune method. Based on the hyunseoki/ko-en-llama2-13b architecture, this model is optimized for Korean language tasks, demonstrating competitive performance on the KO-LLM leaderboard. It is particularly suited for applications requiring robust Korean language understanding and generation.
Model Overview
kyujinpy/Kosy-Platypus2-13B, also known as Kosy🍵llama, is a 13 billion parameter language model developed by Kyujin Han (kyujinpy). It is built upon the hyunseoki/ko-en-llama2-13b base model and has been fine-tuned using the NEFTune (Noisy Embedding Fine-Tuning) method. This approach incorporates random noisy embeddings during training to enhance model performance.
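The core idea of NEFTune can be illustrated with a minimal sketch: during fine-tuning, uniform noise scaled by alpha / sqrt(seq_len * dim) is added to the token embeddings. The pure-Python function below is an illustrative simplification (real implementations, such as the one in the KoNEFTune repository, operate on framework tensors inside the model's forward pass); the function name and alpha value are assumptions for this example.

```python
import math
import random

def neftune_noise(embeddings, alpha=5.0):
    """Add uniform noise to a sequence of embedding vectors.

    The noise magnitude follows the NEFTune scaling rule:
    each component is drawn from Uniform(-s, s) with
    s = alpha / sqrt(seq_len * dim).
    This is a hypothetical pure-Python sketch for illustration.
    """
    seq_len = len(embeddings)
    dim = len(embeddings[0])
    scale = alpha / math.sqrt(seq_len * dim)
    return [
        [x + random.uniform(-scale, scale) for x in row]
        for row in embeddings
    ]

# Toy example: a 4-token sequence of 8-dimensional embeddings
emb = [[0.0] * 8 for _ in range(4)]
noisy = neftune_noise(emb, alpha=5.0)
```

Because the noise shrinks as the sequence grows, longer inputs are perturbed less per component, which is what lets NEFTune regularize training without drowning out the signal.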
Key Capabilities & Features
- Korean Language Optimization: Specifically trained and evaluated for Korean language tasks, leveraging the kyujinpy/KOpen-platypus dataset.
- NEFTune Method: Utilizes the NEFTune technique, detailed in the associated KoNEFTune GitHub repository, allowing for easy application of this fine-tuning approach.
- Performance Benchmarking: Benchmarked on the KO-LLM leaderboard, showing competitive results across various Korean language metrics such as Ko-ARC, Ko-HellaSwag, Ko-MMLU, Ko-TruthfulQA, and Ko-CommonGen V2.
Performance Highlights
This model corresponds to the NEFT(🍵kosy)+MLP-v3 variant, which achieved an average score of 46.31 on the KO-LLM leaderboard, outperforming the base Ko-Platypus2-13B model. Specific scores include 43.34 on Ko-ARC, 54.54 on Ko-HellaSwag, 43.38 on Ko-MMLU, 44.11 on Ko-TruthfulQA, and 46.16 on Ko-CommonGen V2.
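The reported leaderboard average is simply the mean of the five per-task scores, which can be checked directly:

```python
# Per-task KO-LLM scores reported for this model
scores = {
    "Ko-ARC": 43.34,
    "Ko-HellaSwag": 54.54,
    "Ko-MMLU": 43.38,
    "Ko-TruthfulQA": 44.11,
    "Ko-CommonGen V2": 46.16,
}

# The leaderboard average is the unweighted mean of the five tasks
average = sum(scores.values()) / len(scores)
print(round(average, 2))  # 46.31
```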
Ideal Use Cases
This model is well-suited for developers and researchers focusing on:
- Applications requiring high-quality Korean language generation and understanding.
- Experimentation with NEFTune-based fine-tuning for improved model robustness and performance.
- Tasks that benefit from a strong performance baseline on established Korean language benchmarks.
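For generation use cases like those above, Platypus-family models are commonly prompted with an Alpaca-style instruction template. The model card does not state the exact template for Kosy-Platypus2-13B, so the format below (and the `build_prompt` helper name) is an assumption based on the Platypus lineage:

```python
def build_prompt(instruction, input_text=None):
    """Build an Alpaca-style instruction prompt.

    Note: the exact template expected by Kosy-Platypus2-13B is an
    assumption here; consult the KoNEFTune repository for the
    authoritative format.
    """
    if input_text:
        return (
            "### Instruction:\n" + instruction + "\n\n"
            "### Input:\n" + input_text + "\n\n"
            "### Response:\n"
        )
    return "### Instruction:\n" + instruction + "\n\n### Response:\n"

# Example Korean instruction ("What is the capital of Korea?")
prompt = build_prompt("한국의 수도는 어디인가요?")
```

The resulting string can then be tokenized and passed to the model with a standard causal-LM generation loop.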