MuXodious/Mistral-Nemo-Instruct-2407-absolute-heresy

Text Generation · Concurrency Cost: 1 · Model Size: 12B · Quant: FP8 · Context Length: 32k · Published: Feb 2, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights

MuXodious/Mistral-Nemo-Instruct-2407-absolute-heresy is a 12-billion-parameter instruction-tuned language model, produced by running P-E-W's Heretic ablation engine over Mistral-Nemo-Instruct-2407. The base model, developed jointly by Mistral AI and NVIDIA, features a 32768-token context window and was trained on a large proportion of multilingual and code data. The fine-tune carries an "Absolute Heresy" classification, denoting a low refusal rate achieved at low KL divergence from the original weights, which makes it suitable for applications requiring less constrained outputs.


MuXodious/Mistral-Nemo-Instruct-2407-absolute-heresy Overview

This model is a 12-billion-parameter instruction-tuned variant of Mistral-Nemo-Instruct-2407, a base model developed jointly by Mistral AI and NVIDIA. It was fine-tuned with P-E-W's Heretic ablation engine and falls under the "Absolute Heresy" classification: the refusal rate drops to 4/100 (from 87/100 before ablation) while the KL divergence from the original model remains low at 0.0467, indicating the ablation only slightly shifted the model's overall output distribution.
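
To get started, here is a minimal text-generation sketch using transformers, assuming the repository follows standard Hugging Face conventions (the repo id is taken from this card, and the low temperature follows upstream Mistral-Nemo guidance):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "MuXodious/Mistral-Nemo-Instruct-2407-absolute-heresy"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# The checkpoint is published in FP8: depending on how the quant is packaged,
# loading may require a backend with FP8 support; device_map="auto" is a
# convenience assumption here, not a documented requirement of this repo.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Summarize what ablation does to a language model."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Upstream Mistral-Nemo guidance recommends a relatively low temperature (~0.3).
outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.3)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```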

Key Capabilities

  • Robust Instruction Following: Fine-tuned for instruction adherence, building on the strong base of Mistral-Nemo-Instruct-2407.
  • Extended Context Window: Supports a substantial 32768-token context window, enabling processing of longer inputs.
  • Multilingual & Code Proficiency: Trained on a significant proportion of multilingual and code data, enhancing its versatility.
  • Function Calling Support: Capable of function calling with both the mistral_inference and transformers frameworks (see the sketch after this list).
  • Apache 2.0 Licensed: Available under a permissive license for broad use.
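
The sketch below illustrates function calling through transformers' tool-use chat template, assuming this fine-tune preserves the upstream Mistral-Nemo chat template with tool support; the get_current_weather helper is hypothetical and exists only to show the call format.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "MuXodious/Mistral-Nemo-Instruct-2407-absolute-heresy"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

def get_current_weather(location: str, unit: str):
    """Get the current weather in a given location.

    Args:
        location: The city and country, e.g. "Paris, France".
        unit: Temperature unit, either "celsius" or "fahrenheit".
    """
    ...  # hypothetical tool; only the signature/docstring matter for the schema

messages = [{"role": "user", "content": "What's the weather like in Paris right now?"}]
inputs = tokenizer.apply_chat_template(
    messages,
    tools=[get_current_weather],  # transformers derives a JSON schema from the function
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

out = model.generate(inputs, max_new_tokens=128)
# A successful tool call is emitted as a [TOOL_CALLS] block naming the function
# and its JSON arguments, which your code then parses and executes.
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=False))
```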

Good For

  • Applications requiring less constrained outputs: The "Absolute Heresy" classification indicates a model with a significantly reduced refusal rate, potentially useful for creative or less restrictive generation tasks.
  • Developers familiar with Mistral-Nemo: Like its base model, it can serve as a drop-in replacement for Mistral 7B and integrates with the mistral_inference, transformers, and NeMo frameworks (see the mistral_inference sketch below).
  • Multilingual and Code-centric tasks: Its training on diverse language and code datasets makes it suitable for global and programming-related applications.
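
For the mistral_inference route, the following sketch mirrors the upstream Mistral-Nemo-Instruct-2407 usage pattern. It assumes the weights have been downloaded locally in mistral_inference's consolidated format, as with the upstream model; the local path is a placeholder:

```python
from mistral_inference.transformer import Transformer
from mistral_inference.generate import generate
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.protocol.instruct.messages import UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest

# Placeholder local path: download the repository weights here first.
models_path = "./Mistral-Nemo-Instruct-2407-absolute-heresy"

tokenizer = MistralTokenizer.from_file(f"{models_path}/tekken.json")  # Nemo uses the Tekken tokenizer
model = Transformer.from_folder(models_path)

request = ChatCompletionRequest(messages=[UserMessage(content="Write a limerick about ablation.")])
tokens = tokenizer.encode_chat_completion(request).tokens

out_tokens, _ = generate(
    [tokens], model, max_tokens=128, temperature=0.3,
    eos_id=tokenizer.instruct_tokenizer.tokenizer.eos_id,
)
print(tokenizer.instruct_tokenizer.tokenizer.decode(out_tokens[0]))
```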