yanolja/YanoljaNEXT-EEVE-Instruct-10.8B
YanoljaNEXT-EEVE-Instruct-10.8B is a 10.8 billion parameter instruction-tuned causal language model developed by Yanolja. It is a fine-tune of yanolja/EEVE-Korean-10.8B-v1.0, a Korean vocabulary-extended version of upstage/SOLAR-10.7B-v1.0, trained with Direct Preference Optimization (DPO) using Axolotl. The model is optimized for conversational tasks in Korean and was trained on Korean-translated versions of SlimOrca-Dedup and Ultrafeedback-Binarized-Preferences-Cleaned. It averages 66.48 on the Open LLM Leaderboard, including 64.23 on MMLU (5-Shot) and 83.04 on HellaSwag (10-Shot).
Overview
YanoljaNEXT-EEVE-Instruct-10.8B is a 10.8 billion parameter instruction-tuned language model developed by Yanolja. It is built upon the yanolja/EEVE-Korean-10.8B-v1.0 base model, which itself is a Korean vocabulary-extended version of upstage/SOLAR-10.7B-v1.0. The model was fine-tuned using Direct Preference Optimization (DPO) with the Axolotl framework, focusing on enhancing its conversational abilities.
Key Capabilities
- Korean Language Proficiency: Designed and optimized for understanding and generating Korean text, using the vocabulary expansion technique described in Yanolja's technical report.
- Instruction Following: Fine-tuned with instruction datasets to provide helpful, detailed, and polite answers in a chat-based format.
- Preference Alignment: Utilizes Direct Preference Optimization (DPO) to align model outputs with human preferences, leading to more desirable responses.
- Benchmark Performance: Achieves an average score of 66.48 on the Open LLM Leaderboard, with notable scores such as 64.23 on MMLU (5-Shot) and 83.04 on HellaSwag (10-Shot).
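To make the preference-alignment step more concrete, the following is a minimal sketch of the standard DPO loss for a single preference pair. It is not the training code used for this model: the function name `dpo_loss`, the argument names, and the value `beta=0.1` are assumptions chosen for illustration, and the inputs are summed log-probabilities that a real pipeline would obtain from the policy and a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) response pair.

    Each argument is the summed log-probability of a full response under
    either the policy being trained or the frozen reference model.
    beta=0.1 is an illustrative value, not one taken from this model card.
    """
    # How much more the policy likes each response than the reference does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp

    # Scaled margin between the preferred and the dispreferred response.
    margin = beta * (chosen_logratio - rejected_logratio)

    # loss = -log(sigmoid(margin)), written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


if __name__ == "__main__":
    # Policy identical to the reference: the margin is 0, the loss is log(2).
    print(dpo_loss(-1.0, -1.0, -1.0, -1.0))
    # Policy raises the chosen response and lowers the rejected one.
    print(dpo_loss(-1.0, -3.0, -2.0, -2.0))
```

The loss falls below log(2) once the policy assigns relatively more probability to the preferred response than the reference model does, which is the sense in which DPO aligns outputs with preference data.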
Training Data
The model was trained on Korean-translated versions of the Open-Orca/SlimOrca-Dedup and argilla/ultrafeedback-binarized-preferences-cleaned datasets.
Good For
- Developing Korean-language chatbots and conversational AI applications.
- Tasks requiring instruction-following and polite, detailed responses in Korean.
- Research and development in multilingual LLMs, particularly for Korean language integration.
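For the chatbot use case, a minimal sketch of prompting the model with Hugging Face `transformers` might look like the following. The system preamble and the `Human:` / `Assistant:` turn markers are assumptions based on the "helpful, detailed, and polite" chat format described above; confirm the exact template against the official model card before relying on it. The helper names `build_prompt` and `generate` are hypothetical, and loading the 10.8B checkpoint requires substantial memory.

```python
# Repository name taken from the title of this page.
MODEL_ID = "yanolja/YanoljaNEXT-EEVE-Instruct-10.8B"

# Assumed system preamble; verify against the official model card.
SYSTEM = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the "
    "user's questions."
)


def build_prompt(user_message: str) -> str:
    """Wrap a single user message in the assumed chat template."""
    return f"{SYSTEM}\nHuman: {user_message}\nAssistant:\n"


def generate(user_message: str, max_new_tokens: int = 256) -> str:
    """Generate a reply; downloads the model weights on first use."""
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(build_prompt(user_message), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens and decode only the newly generated part.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    # Prints the formatted prompt only; call generate() to run the model.
    print(build_prompt("한국의 수도는 어디인가요?"))
```

Keeping prompt construction separate from generation makes it easy to swap in the correct template if it differs from the one assumed here.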