Kukedlc/NeuTrixOmniBe-DPO
Kukedlc/NeuTrixOmniBe-DPO is a 7-billion-parameter language model developed by Kukedlc. It was created by merging CultriX/NeuralTrix-7B-dpo and paulml/OmniBeagleSquaredMBX-v3-7B-v2 with LazyMergekit, then fine-tuned with Direct Preference Optimization (DPO) on the Intel/orca_dpo_pairs dataset. The model achieves an average score of 76.17 on the Open LLM Leaderboard, with strong results on benchmarks such as HellaSwag (89.03) and Winogrande (85.16). It is intended for general language understanding and generation tasks, leveraging its merged architecture and DPO training for improved response quality.
NeuTrixOmniBe-DPO Overview
NeuTrixOmniBe-DPO is a 7-billion-parameter language model developed by Kukedlc. It was constructed by merging two base models, CultriX/NeuralTrix-7B-dpo and paulml/OmniBeagleSquaredMBX-v3-7B-v2, using the LazyMergekit tool. Following the merge, the model was further refined through Direct Preference Optimization (DPO) on the Intel/orca_dpo_pairs dataset.
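LazyMergekit is a wrapper around mergekit, which is driven by a YAML merge configuration. The exact configuration used for this model is not reproduced in this card; the sketch below only illustrates the general shape of a mergekit slerp config for these two base models (the layer ranges, interpolation weights, and dtype are illustrative assumptions, not the published settings):

```yaml
# Illustrative mergekit slerp configuration (values are assumptions).
slices:
  - sources:
      - model: CultriX/NeuralTrix-7B-dpo
        layer_range: [0, 32]
      - model: paulml/OmniBeagleSquaredMBX-v3-7B-v2
        layer_range: [0, 32]
merge_method: slerp
base_model: CultriX/NeuralTrix-7B-dpo
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16
```

The `t` parameter controls the interpolation point between the two models per layer group; `t: 0.5` blends them equally.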
Key Capabilities & Performance
This model demonstrates solid performance across a range of benchmarks, achieving an average score of 76.17 on the Open LLM Leaderboard. Notable benchmark results include:
- HellaSwag (10-shot): 89.03
- Winogrande (5-shot): 85.16
- TruthfulQA (0-shot): 77.21
- AI2 Reasoning Challenge (25-shot): 72.78
- GSM8k (5-shot): 68.54
- MMLU (5-shot): 64.28
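As a quick sanity check, the stated leaderboard average is simply the arithmetic mean of the six benchmark scores listed above:

```python
# Recompute the Open LLM Leaderboard average from the six benchmark scores.
scores = {
    "HellaSwag": 89.03,
    "Winogrande": 85.16,
    "TruthfulQA": 77.21,
    "ARC": 72.78,
    "GSM8k": 68.54,
    "MMLU": 64.28,
}
average = round(sum(scores.values()) / len(scores), 2)
print(average)  # 76.17
```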
Training & Architecture
The model's architecture results from a slerp (spherical linear interpolation) merge, which combines the weights of its two base models layer by layer. The subsequent DPO training phase aligns the model's outputs more closely with human preferences, improving its ability to generate high-quality, relevant responses. The development process was informed by Maxime Labonne's LLM course.
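The DPO objective used in this kind of training phase can be sketched numerically. The snippet below is an illustrative, self-contained implementation of the per-pair DPO loss, not Kukedlc's actual training code; the `beta` value and log-probabilities are made-up inputs:

```python
import math

def dpo_loss(policy_chosen_lp, policy_rejected_lp,
             ref_chosen_lp, ref_rejected_lp, beta=0.1):
    """DPO loss for one preference pair.

    Inputs are summed log-probabilities of the chosen and rejected
    responses under the policy being trained and the frozen reference model.
    """
    # Implicit reward margin: how much more the policy prefers the chosen
    # response (relative to the reference) than the rejected one.
    margin = beta * ((policy_chosen_lp - ref_chosen_lp)
                     - (policy_rejected_lp - ref_rejected_lp))
    # Negative log-sigmoid of the margin: small when the policy
    # clearly prefers the chosen response.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# With no preference shift relative to the reference, the loss is ln(2).
print(round(dpo_loss(-10.0, -12.0, -10.0, -12.0), 4))  # 0.6931
```

Training on Intel/orca_dpo_pairs minimizes this loss over many (chosen, rejected) response pairs, nudging the merged model toward preferred responses without a separate reward model.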
Important Note
As noted in the original README, the model has a known bug in which the string "INSTINST" can appear in generated responses. Users should be aware of this output artifact when deploying the model.