norallm/normistral-11b-thinking

Text generation · Concurrency cost: 1 · Model size: 12B · Quant: FP8 · Context length: 32k · Published: Oct 13, 2025 · License: apache-2.0 · Architecture: Transformer · Open weights

NorMistral-11B-Thinking is an instruction-tuned language model developed by norallm, specifically designed for Norwegian. This 11 billion parameter model incorporates a unique 'thinking' capability, allowing it to generate internal reasoning traces alongside its responses. It is optimized for lower-resource languages through fluency-preserving reinforcement learning, making it particularly effective for Norwegian language tasks.


NorMistral-11B-Thinking Overview

NorMistral-11B-Thinking is an 11 billion parameter instruction-tuned language model developed by norallm, specifically optimized for the Norwegian language. It is built upon the NorMistral-11B base model and has undergone extensive post-training using a novel fluency-preserving reinforcement learning approach, as detailed in their paper "Fluent Alignment with Disfluent Judges: Post-training for Lower-resource Languages".

Key Capabilities

  • Norwegian Language Proficiency: Excels in understanding and generating Norwegian (Bokmål and Nynorsk) text.
  • "Thinking" Capability: Generates an internal reasoning trace (enclosed in <think>...</think> tokens) before producing the final response, offering transparency into its decision-making process.
  • Instruction Following: Highly capable of following complex instructions due to supervised finetuning on English responses and reasoning traces from Kimi-K2-Thinking.
  • Reinforcement Learning: Utilizes direct reinforcement learning from AI feedback (d-RLAIF) with Mistral-Large-Instruct-2411 as the reward model, enhancing its fluency and alignment.
  • Apache 2.0 License: Freely available for use, with model weights under an open license.
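Because the reasoning trace is wrapped in `<think>...</think>` tokens, it can be separated from the final answer with simple string handling. A minimal sketch, assuming the delimiters appear verbatim in the decoded output (the sample string below is illustrative, not a real generation):

```python
import re

# Non-greedy match so only the first trace is captured; DOTALL lets the
# trace span multiple lines.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def split_thinking(output: str) -> tuple[str, str]:
    """Split decoded model output into (reasoning_trace, final_answer)."""
    match = THINK_RE.search(output)
    if match is None:
        # No thinking block emitted; treat everything as the answer.
        return "", output.strip()
    reasoning = match.group(1).strip()
    answer = output[match.end():].strip()
    return reasoning, answer

# Illustrative output shape, not actual model text.
sample = "<think>Brukeren spør på bokmål.</think>Hei! Hvordan kan jeg hjelpe?"
trace, answer = split_thinking(sample)
```

Hiding the trace by default and exposing it on demand is a common way to use such models in a chat UI while keeping the transparency benefit.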

Evaluation Highlights

NorMistral-11B-Thinking demonstrates strong performance on a generative version of NorEval, particularly in classification tasks like NoReC sentiment analysis and NorIdiom, often outperforming models like Llama-3.1-8B and Mistral-Nemo-12B on specific Norwegian benchmarks. While other models lead in certain areas (e.g., Qwen3-15B in NorCSQA and NorOBQA), NorMistral-11B-Thinking shows competitive results and particular strength in Norwegian language understanding.

Good for

  • Norwegian Language Applications: Ideal for chatbots, content generation, and language understanding tasks specifically in Norwegian.
  • Research and Development: Useful for researchers exploring reinforcement learning, lower-resource language models, and transparent AI reasoning.
  • Local Deployment: Available in various GGUF formats for efficient local inference with ollama or llama.cpp.
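Assuming a GGUF release of the model is published, local inference follows the standard Ollama and llama.cpp workflows. The repository name, quant tag, and filename below are hypothetical; check the model page for the actual GGUF artifacts:

```shell
# Hypothetical repo and quant tag; substitute the actual GGUF release.
MODEL="hf.co/norallm/normistral-11b-thinking-GGUF:Q4_K_M"

# Ollama can pull a GGUF directly from a Hugging Face repo:
echo "ollama run ${MODEL}"

# llama.cpp instead takes a local GGUF file:
echo 'llama-cli -m normistral-11b-thinking.Q4_K_M.gguf -p "Hva er hovedstaden i Norge?"'
```

A Q4_K_M quant of an 11B model typically fits in roughly 7 GB of memory, which makes CPU-only or consumer-GPU inference practical.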