ewald1976/nemo-crownelius-st-12b

TEXT GENERATIONConcurrency Cost:1Model Size:12BQuant:FP8Ctx Length:32kTool Calling:SupportedPublished:Jun 13, 2026License:apache-2.0Architecture:Transformer0.0K Open Weights Cold

The ewald1976/nemo-crownelius-st-12b is a 12 billion parameter language model based on the Mistral-NeMo architecture, featuring a 32768 token context length. This model utilizes a unique "surgical style-tuning" methodology, where only the lm_head is retrained, preserving core reasoning while significantly altering linguistic style. It is specifically fine-tuned on the Crownelius/Opus-4.5-3000x dataset to produce rich, varied, and sophisticated prose.

Loading preview...

Overview

ewald1976/nemo-crownelius-st-12b is a 12 billion parameter model built upon the Mistral-NeMo-12B architecture, distinguished by its innovative "surgical style-tuning" approach. Unlike traditional fine-tuning, this method exclusively targets the lm_head (output projection) while freezing all other layers, including attention mechanisms and MLP layers (0–39). This preserves the model's inherent reasoning, instruction-following, and general knowledge capabilities.

Key Capabilities

  • Sophisticated Prose Generation: Rewires linguistic preferences to generate rich, varied, and sophisticated prose, moving beyond generic "AI-slop" vocabulary.
  • Preserved Core Intelligence: Maintains the underlying logic and instruction-following abilities of the base Mistral-NeMo model.
  • Efficient Style Transfer: Achieves significant stylistic changes with minimal training (1 epoch, 2e-4 learning rate) by focusing solely on the output layer.

Training & Dataset

The model was trained using the Crownelius/Opus-4.5-3000x dataset, specifically chosen for its specialized, modern prose. The training process was highly constrained to ensure the lm_head absorbed the stylistic texture without semantic overfitting.

Recommended Usage

This model is ideal for applications requiring a distinct and refined writing style, where the base model's intelligence needs to be retained but its output prose needs a significant stylistic upgrade. Recommended sampler settings include a Temperature of 0.7-0.9, Min_P of 0.05, Top_P of 0.95, and a Repetition Penalty of 1.05.