ENERGY-DRINK-LOVE/nox_DPOv3
ENERGY-DRINK-LOVE/nox_DPOv3 is a 10.7 billion parameter language model developed by Youjin Chung and Jingyeom Kim, fine-tuned using Direct Preference Optimization (DPO). It is based on the davidkim205/nox-solar-10.7b-v4 base model and trained on a custom DPO dataset, including translated English datasets. This model demonstrates strong performance on Korean language benchmarks, particularly in tasks like KoBEST BoolQ.
Model Overview
ENERGY-DRINK-LOVE/nox_DPOv3 is a 10.7-billion-parameter language model developed by Youjin Chung and Jingyeom Kim. It is built on the davidkim205/nox-solar-10.7b-v4 base model and fine-tuned with Direct Preference Optimization (DPO).
Key Training Details
- Base Model: davidkim205/nox-solar-10.7b-v4
- Training Method: Direct Preference Optimization (DPO), as described in arxiv.org/abs/2305.18290.
- Dataset: A proprietary DPO dataset, which includes data derived from AI-hub and translated English datasets like OpenOrca DPO, utilizing a custom translation model.
- Hardware: Trained on 8 A100 GPUs using Deepspeed and Huggingface TRL Trainer.
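The core of the DPO method cited above (arXiv:2305.18290) is a simple contrastive loss over log-probabilities from the policy and a frozen reference model; in practice this is what Hugging Face TRL's trainer optimizes. The sketch below is an illustrative, dependency-free implementation of that per-example loss, not the model's actual training code; the function name and interface are assumptions for illustration.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss (Rafailov et al., arXiv:2305.18290).

    Each argument is the summed log-probability of a full response under
    the policy being trained or the frozen reference model. `beta`
    controls how strongly the policy is kept close to the reference.
    """
    # Implicit rewards: how much more likely each response became
    # under the policy relative to the reference model.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(margin)), written as log1p(exp(-margin)) for stability.
    return math.log1p(math.exp(-margin))

# The loss shrinks as the policy widens the gap between the preferred
# ("chosen") and dispreferred ("rejected") responses:
print(dpo_loss(-10.0, -20.0, -12.0, -18.0))
```

With equal log-probabilities everywhere the loss starts at log 2, and it decreases monotonically as the chosen-vs-rejected margin grows, which is what pushes the model toward preferred responses.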
Performance Highlights
The model was evaluated with the Ko LM Eval Harness (macro F1 scores on 0-shot tasks):
- kobest_boolq: 0.931613
- kobest_copa: 0.740751
- kobest_hellaswag: 0.468602
- kobest_sentineg: 0.488465
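For readers unfamiliar with the metric behind these numbers, macro F1 averages the per-class F1 scores with equal weight, so minority classes count as much as majority ones. A minimal, self-contained implementation (written for this card, not taken from the harness):

```python
def macro_f1(y_true, y_pred):
    """Macro-averaged F1: compute F1 per class, then average with
    equal weight across classes."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if p == label and t != label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        f1_scores.append(f1)
    return sum(f1_scores) / len(f1_scores)

# Toy BoolQ-style yes/no predictions:
print(macro_f1(["yes", "no", "yes", "no"], ["yes", "no", "no", "no"]))
```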
Intended Use Cases
This model is particularly well-suited for applications requiring strong performance in Korean language understanding and generation, especially tasks related to question answering and natural language inference, as indicated by its high score on KoBEST BoolQ.
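In 0-shot evaluations like KoBEST BoolQ, the model is not asked to generate free text; the harness scores each candidate answer by its log-likelihood and picks the highest. The sketch below shows that selection logic with a stand-in scoring function; `score_fn`, the prompt template, and the toy scorer are all hypothetical, not the harness's exact implementation.

```python
def pick_answer(score_fn, context, question, choices=("예", "아니오")):
    """0-shot multiple-choice scoring: the model 'answers' with
    whichever continuation it assigns the higher log-likelihood.
    `score_fn(prompt, continuation) -> log-prob` stands in for a
    real model call (hypothetical interface)."""
    prompt = f"{context}\n질문: {question}\n답:"
    scores = {c: score_fn(prompt, " " + c) for c in choices}
    return max(scores, key=scores.get)

# Toy scorer: prefers "예" (yes) exactly when the prompt mentions "서울" (Seoul).
def toy_score(prompt, continuation):
    return 0.0 if ("서울" in prompt) == (continuation.strip() == "예") else -1.0

print(pick_answer(toy_score, "서울은 한국의 수도이다.", "서울은 수도인가?"))  # 예
```

Swapping `toy_score` for a function that queries the actual model's token log-probabilities yields the evaluation setup the BoolQ score above reflects.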