Mistral-Nemo-Instruct-2407-abliterated is an abliterated variant of Mistral-Nemo-Instruct-2407, a 12-billion-parameter large language model trained jointly by Mistral AI and NVIDIA with a 128k context window. The ablation suppresses the model's strongest refusal directions while largely preserving its performance. Trained on a large proportion of multilingual and code data, it is offered as a drop-in replacement for Mistral 7B.
Overview
natong19/Mistral-Nemo-Instruct-2407-abliterated is an ablated version of the Mistral-Nemo-Instruct-2407 model, a 12-billion-parameter Large Language Model (LLM) developed jointly by Mistral AI and NVIDIA, which outperforms models of similar or smaller size.
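For reference, a minimal sketch of loading the model with Hugging Face `transformers`; the repo id comes from this card, while the dtype, device mapping, and generation settings are illustrative assumptions:

```python
# Minimal sketch: load the abliterated checkpoint with Hugging Face transformers.
# The repo id is from this card; dtype/device settings are assumptions.
MODEL_ID = "natong19/Mistral-Nemo-Instruct-2407-abliterated"

def generate(prompt: str, max_new_tokens: int = 256) -> str:
    # Imports kept inside the function so the sketch is lightweight to define.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
    )
    # Use the checkpoint's built-in chat template for instruct-style prompting.
    messages = [{"role": "user", "content": prompt}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)
```

Note that calling `generate(...)` downloads the full set of 12B-parameter weights; the chat template is inherited from Mistral-Nemo-Instruct-2407.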
Key Features
- Ablated Refusal Directions: The model's most prominent refusal behaviors have been reduced through weight orthogonalization, though it may still exhibit some refusal or provide unsolicited advice.
- Extended Context Window: Trained with a 128k-token context window, enabling the model to process long inputs and generate coherent, extended outputs.
- Multilingual and Code Proficiency: Benefits from training on a significant proportion of multilingual and code-specific data, enhancing its capabilities in these domains.
- Drop-in Replacement: Positioned as a direct replacement for Mistral 7B, suggesting compatibility and potentially improved performance for existing applications.
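The "weight orthogonalization" mentioned above can be illustrated with a few lines of linear algebra: given a unit "refusal direction" r in activation space, a weight matrix that writes to the residual stream is replaced by its projection onto the subspace orthogonal to r, so the layer can no longer write any output along that direction. A toy NumPy sketch with random data (not the model's actual weights or refusal direction):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # toy hidden size

W = rng.normal(size=(d, d))          # a weight matrix writing to the residual stream
r = rng.normal(size=d)
r /= np.linalg.norm(r)               # unit "refusal direction" (illustrative)

# Orthogonalize: W' = (I - r r^T) W removes the output component along r.
W_orth = W - np.outer(r, r) @ W

# For any input x, the layer's output now has zero component along r.
x = rng.normal(size=d)
component_along_r = float(r @ (W_orth @ x))  # ~0 up to floating-point error
```

In the real procedure this projection is applied to every matrix that writes to the residual stream, with r estimated from activation differences between harmful and harmless prompts; the sketch only shows the core operation.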
Performance Highlights
Evaluations using lm-evaluation-harness 0.4.2 show competitive performance:
- MMLU (5-shot): 68.8
- GSM8K (5-shot): 75.2
- TruthfulQA (0-shot): 55.0
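The scores above were reportedly produced with lm-evaluation-harness 0.4.2; a command along these lines should reproduce the MMLU number, though the exact flags (dtype, batch size) are assumptions:

```shell
pip install lm-eval==0.4.2
lm_eval --model hf \
  --model_args pretrained=natong19/Mistral-Nemo-Instruct-2407-abliterated,dtype=bfloat16 \
  --tasks mmlu --num_fewshot 5 --batch_size auto
```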
These scores indicate that the model's general reasoning and knowledge capabilities are well preserved post-ablation.
Use Cases
This model is suitable for applications requiring a powerful LLM with reduced refusal tendencies, particularly in multilingual environments or for code-related tasks. Its large context window makes it ideal for complex conversations, document analysis, and code generation where extensive context is beneficial.