alexgusevski/Mistral-Nemo-Instruct-2407-heretic-noslop-mlx-fp16

Text generation · Concurrency cost: 1 · Model size: 12B · Quantization: FP16 · Context length: 32k · Published: Jan 12, 2026 · License: apache-2.0 · Architecture: Transformer · Open weights

alexgusevski/Mistral-Nemo-Instruct-2407-heretic-noslop-mlx-fp16 is a 12-billion-parameter instruction-tuned language model, converted to the MLX format by alexgusevski from the original p-e-w/Mistral-Nemo-Instruct-2407-heretic-noslop model. It supports a 32,768-token context length and is designed for efficient deployment and inference on Apple silicon via the MLX framework, making it well suited to general instruction-following tasks on the Mistral architecture.


Overview

alexgusevski/Mistral-Nemo-Instruct-2407-heretic-noslop-mlx-fp16 was converted to the MLX format by alexgusevski from the original p-e-w/Mistral-Nemo-Instruct-2407-heretic-noslop model using mlx-lm version 0.29.1. The conversion keeps the weights in FP16, as the repository name indicates, and enables optimized on-device inference on Apple silicon.
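Because the weights are already converted, a first generation requires only the mlx-lm package. A minimal sketch using the mlx-lm Python API on an Apple silicon Mac (`pip install mlx-lm`; the prompt text and token budget below are illustrative):

```python
from mlx_lm import load, generate

# Downloads (on first use) and loads the MLX weights and tokenizer.
model, tokenizer = load(
    "alexgusevski/Mistral-Nemo-Instruct-2407-heretic-noslop-mlx-fp16"
)

# Wrap the user message in the model's chat template so the
# instruction-tuning special tokens are applied correctly.
messages = [{"role": "user", "content": "Write a haiku about Apple silicon."}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

# max_tokens is an illustrative budget; verbose=True streams the output.
response = generate(model, tokenizer, prompt=prompt, max_tokens=256, verbose=True)
```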

Key Capabilities

  • Instruction Following: Designed to understand and execute user instructions effectively.
  • MLX Compatibility: Fully compatible with the MLX framework for efficient local deployment on Apple silicon.
  • Context Length: Supports a 32,768-token context window, allowing it to process long inputs and sustain coherent multi-turn conversations (see the sketch after this list).
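Since the chat template and the 32k window do the heavy lifting, multi-turn use is just a matter of re-rendering the accumulated history each turn. A sketch under the same assumptions as above (the conversation content is made up):

```python
from mlx_lm import load, generate

model, tokenizer = load(
    "alexgusevski/Mistral-Nemo-Instruct-2407-heretic-noslop-mlx-fp16"
)

messages = [
    {"role": "user", "content": "Explain the MLX framework in two sentences."}
]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
reply = generate(model, tokenizer, prompt=prompt, max_tokens=200)

# Append the assistant turn, then ask a follow-up; the 32,768-token
# window leaves ample room for long histories or pasted documents.
messages.append({"role": "assistant", "content": reply})
messages.append({"role": "user", "content": "Now give one concrete use case."})
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
print(generate(model, tokenizer, prompt=prompt, max_tokens=200))
```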

Good For

  • Developers working with Apple silicon who require an instruction-tuned model.
  • Applications needing a 12B parameter model with a large context window for general language tasks.
  • Experimentation and deployment of instruction-following LLMs within the MLX ecosystem.