Model Overview
This model, p-e-w/Mistral-Nemo-Instruct-2407-heretic-noslop, is a 12-billion-parameter instruction-tuned variant of Mistral-Nemo-Instruct-2407, a model developed jointly by Mistral AI and NVIDIA. The key differentiator of this version is its "slop-reduced" nature, achieved by processing the base model with a development version of the Heretic framework by p-e-w. This optimization aims to produce cleaner, more focused outputs by reducing the rate of irrelevant or formulaic "slop" phrasing in responses.
Key Capabilities & Features
- Slop Reduction: Significantly reduces the occurrence of "slop" in generated responses, lowering the share of responses containing slop from 94/100 in the original model to 63/100 in this version.
- Mistral-Nemo Architecture: Based on Mistral-Nemo-Instruct-2407, whose base model supports a 128k context window (the model card for this specific version lists 32768 tokens) and was trained on a large proportion of multilingual and code data.
- Instruction Following: Designed for general instruction-following tasks, leveraging the fine-tuning of the base Mistral-Nemo model.
- Multilingual Support: Inherits multilingual capabilities from the base model, with reported MMLU scores across various languages including French, German, Spanish, and Chinese.
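The slop metric above (responses containing slop out of 100) can be illustrated with a minimal sketch. The phrase list and substring-matching approach below are illustrative assumptions, not the actual evaluation performed by the Heretic framework:

```python
# Hypothetical phrase-based slop check. The phrase list and the counting
# method are illustrative assumptions, not Heretic's actual evaluation.
SLOP_PHRASES = [
    "it's important to note",
    "delve into",
    "in conclusion",
    "a testament to",
]

def contains_slop(response: str) -> bool:
    """Return True if any known slop phrase appears in the response."""
    text = response.lower()
    return any(phrase in text for phrase in SLOP_PHRASES)

def slop_rate(responses: list[str]) -> float:
    """Fraction of responses containing at least one slop phrase."""
    return sum(contains_slop(r) for r in responses) / len(responses)

# Toy example: two small response sets.
original = ["It's important to note that...", "Let me delve into this."]
noslop = ["Here is the answer.", "Let me delve into this."]
print(slop_rate(original))  # 1.0
print(slop_rate(noslop))    # 0.5
```

Under this framing, the reported numbers correspond to a slop rate dropping from 0.94 to 0.63 over a 100-response sample.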
Good For
- Applications requiring cleaner outputs: Ideal for use cases where reducing irrelevant or verbose text is critical.
- General instruction-following: Suitable for a wide range of tasks where a model needs to adhere closely to given instructions.
- Multilingual tasks: Can be applied to tasks involving multiple languages due to its training data composition.
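For reference, Mistral-style instruct models are typically prompted with an `[INST] ... [/INST]` template. The helper below is a hand-rolled sketch of that format; in practice, the tokenizer shipped with the model should be used via `tokenizer.apply_chat_template()` so the exact special tokens and whitespace match:

```python
# Hand-rolled sketch of the Mistral [INST] instruction template. Exact
# whitespace and BOS handling vary by tokenizer version; prefer loading
# the tokenizer for p-e-w/Mistral-Nemo-Instruct-2407-heretic-noslop and
# calling tokenizer.apply_chat_template() instead.

def format_instruct_prompt(user_message: str) -> str:
    """Wrap a single user turn in the Mistral instruct template."""
    return f"<s>[INST] {user_message} [/INST]"

prompt = format_instruct_prompt("Summarize the report in three bullet points.")
print(prompt)
```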
Limitations
- The model does not include moderation mechanisms and may produce unmoderated outputs; external guardrails are required for sensitive applications.