mlabonne/NeuralDaredevil-7B
mlabonne/NeuralDaredevil-7B is a 7 billion parameter language model, fine-tuned using Direct Preference Optimization (DPO) on the argilla/distilabel-intel-orca-dpo-pairs dataset. It is a preference-tuned variant of mlabonne/Daredevil-7B, designed to improve response quality through preference learning. It demonstrates competitive performance on benchmarks such as the Nous suite and the Open LLM Leaderboard, making it suitable for general-purpose conversational AI and instruction-following tasks.
NeuralDaredevil-7B: DPO Fine-tune for Enhanced Instruction Following
NeuralDaredevil-7B is a 7 billion parameter language model by mlabonne, created by applying Direct Preference Optimization (DPO) to the existing mlabonne/Daredevil-7B model. The DPO fine-tuning used the argilla/distilabel-intel-orca-dpo-pairs preference dataset, with the aim of improving the model's ability to follow instructions and generate high-quality responses.
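The card does not publish the exact training recipe, but a setup like this is commonly expressed with the TRL library. The sketch below is illustrative only: the hyperparameters (beta, batch size, epochs) and the dataset column mapping are assumptions, not values taken from the card.

```python
# Minimal sketch of a DPO fine-tune in the spirit described above, using TRL.
# Hyperparameters and column names are assumptions; check the dataset schema
# and the TRL version you have installed before running.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base = "mlabonne/Daredevil-7B"
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

# DPO expects prompt / chosen / rejected triples; the "input" column name
# below is an assumed mapping, not confirmed by the card.
dataset = load_dataset("argilla/distilabel-intel-orca-dpo-pairs", split="train")
dataset = dataset.map(
    lambda row: {
        "prompt": row["input"],   # assumed column name
        "chosen": row["chosen"],
        "rejected": row["rejected"],
    }
)

args = DPOConfig(
    output_dir="NeuralDaredevil-7B",
    beta=0.1,  # strength of the implicit KL penalty (assumed value)
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    num_train_epochs=1,
)

# Note: recent TRL releases rename the tokenizer argument to processing_class.
trainer = DPOTrainer(model=model, args=args, train_dataset=dataset, tokenizer=tokenizer)
trainer.train()
```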
Key Capabilities & Performance
- DPO Fine-tuning: Leverages preference data to align model outputs more closely with human preferences.
- Competitive Benchmarking: Achieves an average score of 59.39 on the Nous suite (AGIEval: 45.23, GPT4All: 76.2, TruthfulQA: 67.61, Bigbench: 48.52), positioning it favorably against similar 7B models like mlabonne/Beagle14-7B and argilla/distilabeled-Marcoro14-7B-slerp.
- Open LLM Leaderboard: Records an average score of 74.12, with strong results in HellaSwag (87.62), Winogrande (82.08), and GSM8k (73.16).
- Instruction Following: Uses the same prompt template as mistralai/Mistral-7B-Instruct-v0.2, ensuring compatibility with established instruction formats (see the usage sketch after this list).
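Because the model follows the Mistral-7B-Instruct-v0.2 prompt template, it can be queried through transformers' standard chat-template API. The snippet below is a minimal sketch assuming the repo ships a chat template readable by apply_chat_template; the prompt and generation settings are illustrative, not recommendations from the card.

```python
# Minimal inference sketch; generation settings are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mlabonne/NeuralDaredevil-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "Summarize what DPO fine-tuning does."}]
# Renders the [INST] ... [/INST] format used by Mistral-7B-Instruct-v0.2.
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)

output = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```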
Good For
- General-purpose instruction-following applications.
- Conversational AI where response quality and alignment are crucial.
- Developers seeking a DPO-tuned 7B model with solid benchmark performance.