MuXodious/Mistral-Nemo-Instruct-2407-absolute-heresy Overview
This model is a 12-billion-parameter instruction-tuned variant of Mistral-Nemo-Instruct-2407, a base model jointly developed by Mistral AI and NVIDIA. It has been fine-tuned using P-E-W's Heretic ablation engine and classified with an "Absolute Heresy" index: the refusal rate dropped from 87/100 on the original model to 4/100, while the low KL divergence from the original (0.0467) indicates that overall output behavior stays close to the base model.
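The KL divergence figure above measures how far the ablated model's token distributions drift from the base model's; a small value means the fine-tune changed little beyond refusal behavior. A minimal sketch of the metric itself (toy distributions, not actual model outputs):

```python
import math

def kl_divergence(p, q):
    """KL(P || Q) for discrete distributions, in nats.

    Zero when the distributions are identical; grows as they diverge.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Toy next-token distributions (illustrative only, not from the model)
base    = [0.7, 0.2, 0.1]
ablated = [0.6, 0.3, 0.1]

drift = kl_divergence(base, ablated)  # small positive value: distributions are close
```

A per-position average of such values over a shared prompt set is one common way tools report a single divergence number for a whole model.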
Key Capabilities
- Robust Instruction Following: Fine-tuned for instruction adherence, building on the strong base of Mistral-Nemo-Instruct-2407.
- Extended Context Window: Supports a substantial 32768-token context window, enabling processing of longer inputs.
- Multilingual & Code Proficiency: Trained on a significant proportion of multilingual and code data, enhancing its versatility.
- Function Calling Support: Capable of function calling, demonstrated with examples for the mistral_inference and transformers frameworks.
- Apache 2.0 Licensed: Available under a permissive license for broad use.
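To illustrate the function-calling capability, here is a sketch of the JSON-schema tool definition format commonly passed to transformers chat templates; the `get_current_weather` tool and its parameters are hypothetical, not part of this model card:

```python
import json

# Hypothetical tool definition in the JSON-schema style used for
# function calling with chat templates.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_current_weather",  # illustrative tool name
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}

messages = [
    {"role": "user", "content": "What's the weather in Paris?"},
]

# With transformers, such definitions are typically supplied alongside the
# messages when rendering the prompt, e.g. via the tokenizer's chat template.
serialized = json.dumps(get_weather_tool)
```

The model is then expected to emit a structured call naming the tool and its arguments, which the calling application executes and feeds back as a follow-up message.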
Good For
- Applications requiring less constrained outputs: The "Absolute Heresy" classification indicates a model with a significantly reduced refusal rate, potentially useful for creative or less restrictive generation tasks.
- Developers familiar with Mistral-Nemo: Acts as a drop-in replacement for Mistral 7B and integrates well with the mistral_inference, transformers, and NeMo frameworks.
- Multilingual and Code-centric tasks: Its training on diverse language and code datasets makes it suitable for global and programming-related applications.
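When working with the 32768-token context window mentioned above, applications typically reserve part of the window for generation. A small budgeting helper (the model id in the comment is taken from this card; loading it requires transformers and substantial VRAM, so that part is shown commented out):

```python
# Context-window budgeting sketch for a 32768-token window.
CONTEXT_WINDOW = 32768

def max_new_tokens(prompt_token_count: int, window: int = CONTEXT_WINDOW) -> int:
    """Tokens left for generation after the prompt occupies part of the window."""
    return max(0, window - prompt_token_count)

budget = max_new_tokens(1000)  # room left for generation after a 1000-token prompt

# Loading the model itself (not run here; requires GPU and a large download):
# from transformers import pipeline
# chat = pipeline(
#     "text-generation",
#     model="MuXodious/Mistral-Nemo-Instruct-2407-absolute-heresy",
# )
```

Clamping the budget at zero avoids passing a negative generation length when a prompt already fills (or overflows) the window.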