Name: DavidAU/Mistral-Nemo-Instruct-2407-12B-Thinking-M-Claude-Opus-High-Reasoning API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: DavidAU

Model Overview: Mistral-Nemo-Instruct-2407-12B-Thinking-M-Claude-Opus-High-Reasoning

This 12 billion parameter model, fine-tuned by DavidAU using Unsloth, is an instruction-tuned variant of Mistral Nemo. Its primary differentiator is a specialized fine-tuning for "thinking/reasoning" tasks, leveraging a dataset derived from Claude Opus 4.5 High Reasoning. The "M" in its name signifies a medium level of this reasoning tune, offering a balance between reasoning depth and a lighter computational footprint compared to its "HI" counterpart.

Key Capabilities & Features

Enhanced Reasoning: Designed to produce compact and precise reasoning outputs, rather than verbose explanations.
Temperature Agnostic Reasoning: The model's thinking activation is not significantly affected by temperature settings (recommended range: 0.1 to 2.5+).
Flexible Output Control: Adjusting the repetition penalty (e.g., to 1.0) can lead to longer reasoning blocks and potentially higher quality output.
Context Length: Supports a substantial context window of 32768 tokens, with a minimum suggested context of 4k, ideally 8k+.
No System Prompt Required: Thinking tags/blocks are self-generated by the model.
Optimized for Specific Quants: Recommends Q4KS (non-imatrix) or IQ3_M (imatrix) or higher for optimal reasoning performance.

Performance & Benchmarks

While specific benchmarks for the "reasoning" fine-tuning are not yet available, the base Mistral Nemo model demonstrates strong performance across various benchmarks:

MMLU (5-shot): 68.0%
HellaSwag (0-shot): 83.5%
Multilingual MMLU: Scores ranging from 59.0% (Chinese, Japanese) to 64.6% (Spanish).

Good For

Applications requiring concise and logical reasoning.
Use cases where complex problem-solving and analytical thought are crucial.
Scenarios benefiting from a large context window for detailed input analysis.
Developers looking for a model that can self-generate thought processes without explicit system prompts.

Overview

Model Overview: Mistral-Nemo-Instruct-2407-12B-Thinking-M-Claude-Opus-High-Reasoning

Key Capabilities & Features

Performance & Benchmarks

Good For

Full Model Card (README)