macadeliccc/MBX-7B-v3-DPO

Text generation · Concurrency cost: 1 · Model size: 7B · Quant: FP8 · Context length: 8K · Published: Jan 30, 2024 · License: CC · Architecture: Transformer

macadeliccc/MBX-7B-v3-DPO is a 7 billion parameter causal language model, fine-tuned from flemmingmiguel/MBX-7B-v3 using Direct Preference Optimization (DPO). This model is optimized for conversational tasks and general instruction following, demonstrating improved performance over its base model on benchmarks like EQ-Bench and the Open LLM Leaderboard. With a context length of 8192 tokens, it is suitable for applications requiring nuanced responses and enhanced truthfulness.


MBX-7B-v3-DPO Overview

MBX-7B-v3-DPO is a 7 billion parameter language model developed by macadeliccc, built upon the flemmingmiguel/MBX-7B-v3 base model. It has been further refined using Direct Preference Optimization (DPO) with the jondurbin/truthy-dpo-v0.1 dataset, aiming to enhance its instruction-following capabilities and response quality.
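For readers who want to see what such a DPO stage looks like in code, below is a minimal sketch using Hugging Face's trl library with the truthy-dpo-v0.1 dataset named above. The hyperparameters (beta, learning rate, batch sizes) are illustrative assumptions, not the author's published recipe.

```python
# Minimal DPO fine-tuning sketch with Hugging Face trl.
# Hyperparameters are illustrative assumptions, not the recipe
# actually used to produce MBX-7B-v3-DPO.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base = "flemmingmiguel/MBX-7B-v3"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.bfloat16)

# truthy-dpo-v0.1 ships prompt/chosen/rejected columns, the format
# DPOTrainer expects; drop any extra bookkeeping columns.
dataset = load_dataset("jondurbin/truthy-dpo-v0.1", split="train")
dataset = dataset.remove_columns(
    [c for c in dataset.column_names if c not in ("prompt", "chosen", "rejected")]
)

config = DPOConfig(
    output_dir="mbx-7b-v3-dpo",
    beta=0.1,                        # assumed KL-penalty strength
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    learning_rate=5e-6,              # assumed
    num_train_epochs=1,
)

trainer = DPOTrainer(
    model=model,
    ref_model=None,                  # trl clones the policy as the frozen reference
    args=config,
    train_dataset=dataset,
    processing_class=tokenizer,      # named `tokenizer` in older trl versions
)
trainer.train()
```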

Key Capabilities & Performance

  • Improved Instruction Following: The DPO fine-tuning process has led to a notable improvement in conversational quality, as indicated by its EQ-Bench v2 score of 74.32, surpassing the base model's 73.87.
  • Benchmark Performance: On the Open LLM Leaderboard, MBX-7B-v3-DPO achieves an average score of 76.13, with strong results in:
    • HellaSwag (10-Shot): 89.11
    • Winogrande (5-shot): 85.56
    • TruthfulQA (0-shot): 74.00
    • GSM8k (5-shot): 69.67
  • Context Length: Supports an 8192-token context window, allowing longer prompts and more extensive responses (see the inference sketch after this list).
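As a quick illustration of basic usage (not taken from the model card), the sketch below drives the model through Hugging Face transformers; the prompt and generation settings are illustrative assumptions.

```python
# Minimal inference sketch with transformers; prompt and generation
# settings are illustrative, not recommendations from the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "macadeliccc/MBX-7B-v3-DPO"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Explain DPO in two sentences."}]
# Assumes the tokenizer ships a chat template; otherwise format the
# prompt manually in the model's expected style.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```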

Deployment Options

  • Quantized Versions: Available in various quantized formats, including GGUF and ExLlamaV2, offering flexibility for deployment on hardware with limited VRAM. ExLlamaV2 quantizations range from 8.0-bit (8.4 GB VRAM) down to 3.5-bit (4.7 GB VRAM), with the 6.5-bit version recommended for a balance of quality and size. A GGUF loading sketch follows this list.
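For the GGUF route, the snippet below is a minimal sketch using llama-cpp-python; the quant repository id and filename are hypothetical placeholders, so substitute the actual GGUF repo published for this model.

```python
# Minimal GGUF inference sketch with llama-cpp-python.
# The repo id and filename below are hypothetical; replace them with
# the actual GGUF quant repo for this model.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="macadeliccc/MBX-7B-v3-DPO-GGUF",  # hypothetical repo id
    filename="*Q4_K_M.gguf",                   # pick a quant that fits your VRAM
    n_ctx=8192,                                # the model supports an 8k context
    n_gpu_layers=-1,                           # offload all layers if VRAM allows
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Give me one fun fact about owls."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```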

Good for

  • General Chat Applications: Its DPO fine-tuning makes it well-suited for engaging in conversational AI and instruction-based tasks.
  • Truthful and Coherent Responses: The focus on truthfulness during DPO training suggests improved factual accuracy and reduced hallucination.
  • Resource-Constrained Environments: The availability of various quantized versions makes it adaptable for deployment on consumer-grade GPUs.

Popular Sampler Settings

Featherless surfaces the top three sampler configurations its users apply to this model. The tunable parameters are temperature, top_p, top_k, min_p, frequency_penalty, presence_penalty, and repetition_penalty.
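As a sketch of how such sampler settings are applied in practice, the request below passes them through an OpenAI-compatible client; the base URL and every value shown are illustrative assumptions, not the user-derived top configurations.

```python
# Sketch: passing sampler settings through an OpenAI-compatible client.
# The base_url and all parameter values are illustrative assumptions,
# not the actual top configurations used by Featherless users.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.featherless.ai/v1",  # assumed OpenAI-compatible endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="macadeliccc/MBX-7B-v3-DPO",
    messages=[{"role": "user", "content": "Write a haiku about autumn."}],
    temperature=0.7,            # illustrative values throughout
    top_p=0.9,
    frequency_penalty=0.0,
    presence_penalty=0.0,
    # Non-standard samplers (top_k, min_p, repetition_penalty) are passed
    # via extra_body on servers that support them.
    extra_body={"top_k": 40, "min_p": 0.05, "repetition_penalty": 1.1},
)
print(response.choices[0].message.content)
```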