Overview
mlabonne/NeuralDaredevil-8B-abliterated is an 8-billion-parameter model fine-tuned with DPO (Direct Preference Optimization). The fine-tuning was applied to the base model, mlabonne/Daredevil-8B-abliterated, using the mlabonne/orpo-dpo-mix-40k dataset for one epoch. Its primary goal was to recover the performance lost during the initial "abliteration" process, yielding a highly capable uncensored model.
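Since the model derives from Meta-Llama-3-8B, it is assumed to use the standard Llama 3 chat format. The sketch below builds such a prompt by hand purely for illustration; the `format_llama3_prompt` helper is hypothetical, and in practice the `tokenizer.apply_chat_template` method from the transformers library handles this for you.

```python
# Hypothetical helper illustrating the Llama 3 chat format this model is
# assumed to inherit from its base. In real use, prefer
# tokenizer.apply_chat_template from the transformers library.
def format_llama3_prompt(messages: list[dict]) -> str:
    """Render a list of {"role", "content"} dicts as a Llama 3 prompt string."""
    parts = ["<|begin_of_text|>"]
    for msg in messages:
        parts.append(f"<|start_header_id|>{msg['role']}<|end_header_id|>\n\n")
        parts.append(msg["content"])
        parts.append("<|eot_id|>")
    # End with an assistant header so the model generates the reply.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

prompt = format_llama3_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
])
```

The resulting string can be tokenized and passed to the model for generation; the same structure is what chat templating produces under the hood.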
Key Capabilities & Performance
- Uncensored Output: Designed to respond without alignment-driven refusals, making it suitable for applications where such constraints are unwanted.
- Performance Recovery: The DPO fine-tuning successfully mitigates performance degradation from the abliteration of its base model.
- Leaderboard Recognition: Ranked as the best-performing uncensored 8B model on the Open LLM Leaderboard based on its MMLU score.
- Competitive Benchmarks: Achieves an average score of 55.87 in Nous evaluations, outperforming models such as meta-llama/Meta-Llama-3-8B-Instruct and NousResearch/Hermes-2-Theta-Llama-3-8B in its category.
Ideal Use Cases
- Role-playing: Its uncensored nature makes it particularly well-suited for creative and unconstrained role-playing scenarios.
- Applications Not Requiring Alignment: Can be used in any context where traditional safety or alignment filters are not desired or necessary.