p-e-w/Mistral-Nemo-Instruct-2407-heretic-noslop

TEXT GENERATION · Concurrency Cost: 1 · Model Size: 12B · Quant: FP8 · Ctx Length: 32k · Published: Jan 11, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights

p-e-w/Mistral-Nemo-Instruct-2407-heretic-noslop is a 12-billion-parameter instruction-tuned causal language model: a slop-reduced version of Mistral AI and NVIDIA's Mistral-Nemo-Instruct-2407. Developed by p-e-w using the Heretic framework, it is specifically optimized to reduce "slop" (formulaic, undesirable phrasing in generated text), and produces slop in markedly fewer responses than the original model. It offers a 32768-token context window and is suited to general instruction-following tasks where cleaner, more focused outputs are desired.


Model Overview

This model, p-e-w/Mistral-Nemo-Instruct-2407-heretic-noslop, is a 12 billion parameter instruction-tuned variant of the Mistral-Nemo-Instruct-2407 model, developed jointly by Mistral AI and NVIDIA. The key differentiator for this specific version is its "slop-reduced" nature, achieved through processing with a development version of the Heretic framework by p-e-w. This optimization aims to produce cleaner, more focused outputs by reducing irrelevant or undesirable "slop" in responses.
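
The model can be run locally like any Hugging Face causal LM. The sketch below is a minimal example assuming the repository follows the standard transformers layout inherited from the base Mistral-Nemo-Instruct-2407 model; the prompt, dtype, and sampling settings are illustrative (the base model's card recommends a relatively low temperature, e.g. 0.3).

```python
# Minimal local-inference sketch using Hugging Face transformers.
# Assumes the repo follows the standard layout of the base
# Mistral-Nemo-Instruct-2407 model; adjust dtype/device for your hardware.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "p-e-w/Mistral-Nemo-Instruct-2407-heretic-noslop"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Summarize the plot of Hamlet in two sentences."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Low temperature, per the base model card's recommendation.
outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.3)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```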

Key Capabilities & Features

  • Slop Reduction: Markedly reduces the occurrence of "slop" in generated responses: 63/100 responses contain slop, compared to 94/100 for the original model.
  • Mistral-Nemo Architecture: Based on Mistral-Nemo-Instruct-2407, whose architecture supports a 128k context window (the model card for this specific version lists 32768 tokens) and which was trained on a large proportion of multilingual and code data; the 32768-token limit is reflected in the serving sketch after this list.
  • Instruction Following: Designed for general instruction-following tasks, leveraging the fine-tuning of the base Mistral-Nemo model.
  • Multilingual Support: Inherits multilingual capabilities from the base model, with reported MMLU scores across various languages including French, German, Spanish, and Chinese.
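
For throughput-oriented serving, the 32768-token limit noted above can be enforced explicitly. The following is a hedged sketch using vLLM, assuming a recent version that provides LLM.chat; model name aside, all settings are illustrative.

```python
# Serving sketch with vLLM; max_model_len matches the 32768-token
# context length listed on this page. Sampling settings are illustrative.
from vllm import LLM, SamplingParams

llm = LLM(
    model="p-e-w/Mistral-Nemo-Instruct-2407-heretic-noslop",
    max_model_len=32768,  # matches the context length stated above
)

params = SamplingParams(temperature=0.3, max_tokens=256)

# LLM.chat applies the model's chat template before generation
# (available in recent vLLM releases).
outputs = llm.chat(
    [{"role": "user", "content": "Explain FP8 quantization in one paragraph."}],
    params,
)
print(outputs[0].outputs[0].text)
```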

Good For

  • Applications requiring cleaner outputs: Ideal for use cases where reducing irrelevant or verbose text is critical.
  • General instruction-following: Suitable for a wide range of tasks where a model needs to adhere closely to given instructions.
  • Multilingual tasks: Can be applied to tasks in multiple languages thanks to the base model's training data composition; a sample request is sketched after this list.
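
When the model is hosted behind an OpenAI-compatible endpoint, as is common for hosted deployments like this one, a request might look like the following. The base URL and API key are hypothetical placeholders, not values from this page; the German prompt simply exercises the multilingual support noted above.

```python
# Hypothetical request to an OpenAI-compatible endpoint serving this model.
# base_url and api_key are placeholders; substitute your provider's values.
from openai import OpenAI

client = OpenAI(
    base_url="https://example-provider.invalid/v1",  # placeholder endpoint
    api_key="YOUR_API_KEY",                          # placeholder key
)

response = client.chat.completions.create(
    model="p-e-w/Mistral-Nemo-Instruct-2407-heretic-noslop",
    messages=[
        # German prompt to exercise the model's multilingual support:
        # "Name three use cases for this model."
        {"role": "user", "content": "Nenne drei Anwendungsfälle für dieses Modell."}
    ],
    temperature=0.3,
    max_tokens=200,
)
print(response.choices[0].message.content)
```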

Limitations

  • The model does not include moderation mechanisms and may produce unmoderated outputs, requiring external guardrails for sensitive applications.