Kukedlc/NeuralKrishna-7B-V2-DPO
NeuralKrishna-7B-V2-DPO is a 7-billion-parameter causal language model developed by Kukedlc and fine-tuned with Direct Preference Optimization (DPO) for improved instruction following. The model achieves an average score of 76.00 on the Open LLM Leaderboard, showing strong performance across reasoning and language-understanding tasks, and is well suited to general-purpose conversational AI and tasks that require nuanced response generation.
NeuralKrishna-7B-V2-DPO Overview
NeuralKrishna-7B-V2-DPO is a 7-billion-parameter causal language model developed by Kukedlc, fine-tuned using Direct Preference Optimization (DPO). DPO aligns the model's outputs more closely with human preferences, improving its ability to follow instructions and generate high-quality responses. Training used a LoRA configuration whose target modules are k_proj, gate_proj, v_proj, up_proj, q_proj, o_proj, and down_proj.
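The LoRA setup described above might look like the following `peft` configuration. Only the target-module list comes from the model card; the rank, alpha, and dropout values are illustrative assumptions:

```python
from peft import LoraConfig

# LoRA adapter targeting the projection layers named in the model card.
# r, lora_alpha, and lora_dropout are illustrative guesses, not the
# values actually used in training.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=[
        "k_proj", "gate_proj", "v_proj", "up_proj",
        "q_proj", "o_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)
```

Targeting all seven projection matrices (attention and MLP) is a common choice for Mistral-style 7B architectures, since it lets the adapter influence both attention routing and feed-forward computation.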
Key Capabilities & Performance
The model's performance has been evaluated on the Open LLM Leaderboard, achieving an average score of 76.00. Notable results include:
- AI2 Reasoning Challenge (25-shot): 74.06
- HellaSwag (10-shot): 88.97
- MMLU (5-shot): 64.41
- TruthfulQA (0-shot): 76.19
- Winogrande (5-shot): 84.29
- GSM8k (5-shot): 68.08
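The leaderboard average is simply the unweighted mean of the six benchmark scores, which can be verified directly:

```python
# Open LLM Leaderboard scores reported above for NeuralKrishna-7B-V2-DPO
scores = {
    "ARC (25-shot)": 74.06,
    "HellaSwag (10-shot)": 88.97,
    "MMLU (5-shot)": 64.41,
    "TruthfulQA (0-shot)": 76.19,
    "Winogrande (5-shot)": 84.29,
    "GSM8k (5-shot)": 68.08,
}

# The headline number is the unweighted mean of the six scores.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # → 76.00
```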
These scores indicate strong general reasoning, common-sense, and language-understanding abilities. The DPO fine-tuning run used a max_prompt_length of 1024 tokens and a max_length of 1536 tokens, which contributes to the model's proficiency at generating coherent, contextually relevant text.
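A DPO run with those length settings might be sketched as follows with the `trl` library. This is a minimal sketch under stated assumptions: the base checkpoint name, preference dataset, and all hyperparameters other than the two lengths are illustrative, and exact `DPOTrainer` keyword names vary across trl versions:

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig
from trl import DPOConfig, DPOTrainer

# Hypothetical base checkpoint; the card does not name the pre-DPO model.
base = "Kukedlc/NeuralKrishna-7B-V2"
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

# Any preference dataset with "prompt"/"chosen"/"rejected" columns works;
# this particular dataset is an assumption, not from the card.
dataset = load_dataset("Intel/orca_dpo_pairs", split="train")

args = DPOConfig(
    output_dir="neuralkrishna-dpo",
    max_prompt_length=1024,  # from the model card
    max_length=1536,         # from the model card
    per_device_train_batch_size=1,  # illustrative
)

trainer = DPOTrainer(
    model=model,
    args=args,
    train_dataset=dataset,
    processing_class=tokenizer,
    # LoRA modules from the card; r/alpha are illustrative.
    peft_config=LoraConfig(
        r=16, lora_alpha=32, task_type="CAUSAL_LM",
        target_modules=["k_proj", "gate_proj", "v_proj", "up_proj",
                        "q_proj", "o_proj", "down_proj"],
    ),
)
trainer.train()
```

The max_length cap of 1536 leaves roughly 512 tokens for the chosen/rejected completions after a full-length 1024-token prompt, which bounds memory use per preference pair.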
When to Use This Model
NeuralKrishna-7B-V2-DPO is well-suited for applications requiring a capable 7B parameter model with improved instruction following and general conversational abilities. Its balanced performance across various benchmarks makes it a strong candidate for tasks such as:
- General-purpose chatbots and virtual assistants.
- Content generation where nuanced and preference-aligned responses are desired.
- Reasoning tasks and question answering.
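For local experimentation, a minimal inference sketch with the `transformers` library might look like the following. It assumes enough GPU memory for a 7B checkpoint and that the tokenizer ships a chat template; the prompt is illustrative:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Kukedlc/NeuralKrishna-7B-V2-DPO"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype="auto"
)

# Build a chat-formatted prompt; relies on the tokenizer's chat template.
messages = [{"role": "user", "content": "Explain DPO in two sentences."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(
    inputs, max_new_tokens=128, do_sample=True, temperature=0.7
)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```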