ewald1976/nemo-crownelius-st-12b
ewald1976/nemo-crownelius-st-12b is a 12-billion-parameter language model based on the Mistral-NeMo architecture, with a 32768-token context length. It uses a "surgical style-tuning" method in which only the lm_head is retrained, which preserves core reasoning while significantly altering linguistic style. It is fine-tuned on the Crownelius/Opus-4.5-3000x dataset to produce rich, varied, and sophisticated prose.
Overview
ewald1976/nemo-crownelius-st-12b is a 12-billion-parameter model built on the Mistral-NeMo-12B architecture and distinguished by its "surgical style-tuning" approach. Unlike conventional fine-tuning, this method trains only the lm_head (the output projection) and freezes everything else, including the attention and MLP blocks in layers 0–39. Because the frozen layers are left untouched, the model keeps the base model's reasoning, instruction-following, and general knowledge.
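The freezing step described above can be sketched in a few lines. This is a minimal illustration, not the author's training script: the helper name `freeze_all_but_lm_head` is hypothetical, the parameter names in `toy_params` are stand-ins modeled on common Hugging Face naming, and it assumes the lm_head is not weight-tied to the input embeddings. With a real PyTorch model, the same function could be passed `model.named_parameters()`.

```python
from types import SimpleNamespace


def freeze_all_but_lm_head(named_params, trainable_prefix="lm_head"):
    """Mark only parameters under `trainable_prefix` as trainable.

    `named_params` is any iterable of (name, param) pairs where each param
    has a `requires_grad` attribute, e.g. `model.named_parameters()`.
    Returns the names of the parameters left trainable.
    """
    trainable = []
    for name, param in named_params:
        # Everything outside the output projection is frozen.
        param.requires_grad = name.startswith(trainable_prefix)
        if param.requires_grad:
            trainable.append(name)
    return trainable


# Stand-in parameters (hypothetical names) so the sketch runs without a model.
toy_params = [
    ("model.embed_tokens.weight", SimpleNamespace(requires_grad=True)),
    ("model.layers.0.self_attn.q_proj.weight", SimpleNamespace(requires_grad=True)),
    ("model.layers.39.mlp.down_proj.weight", SimpleNamespace(requires_grad=True)),
    ("lm_head.weight", SimpleNamespace(requires_grad=True)),
]

print(freeze_all_but_lm_head(toy_params))  # ['lm_head.weight']
```

Matching on a name prefix keeps the sketch independent of layer count; an optimizer would then be built only from the parameters that still have `requires_grad` set.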
Key Capabilities
- Sophisticated Prose Generation: Rewires linguistic preferences to generate rich, varied, and sophisticated prose, moving beyond generic "AI-slop" vocabulary.
- Preserved Core Intelligence: Maintains the underlying logic and instruction-following abilities of the base Mistral-NeMo model.
- Efficient Style Transfer: Achieves significant stylistic changes with minimal training (1 epoch, 2e-4 learning rate) by focusing solely on the output layer.
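To give a rough sense of why training only the output layer is cheap, the arithmetic below estimates how many parameters the lm_head holds. The vocabulary size (131072) and hidden size (5120) are assumptions taken from the published Mistral-NeMo base configuration and are not stated in this card; the lm_head is also assumed to be a bias-free linear projection, and the total is the nominal 12 billion figure rather than an exact count.

```python
# Assumed values from the published Mistral-NeMo base config (not from this card).
VOCAB_SIZE = 131_072
HIDDEN_SIZE = 5_120
# Nominal total, taken from the "12 billion parameter" description.
NOMINAL_TOTAL_PARAMS = 12_000_000_000


def lm_head_params(vocab_size, hidden_size):
    """Weights in a bias-free linear projection from hidden states to vocab logits."""
    return vocab_size * hidden_size


trainable = lm_head_params(VOCAB_SIZE, HIDDEN_SIZE)
share = trainable / NOMINAL_TOTAL_PARAMS
print(f"{trainable:,} trainable parameters ({share:.1%} of the nominal total)")
# 671,088,640 trainable parameters (5.6% of the nominal total)
```

Under these assumptions, only a small fraction of the weights receive gradient updates, which is consistent with the single-epoch training described above.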
Training & Dataset
The model was trained on the Crownelius/Opus-4.5-3000x dataset, chosen for its specialized, modern prose. Training was deliberately constrained (a single epoch at a 2e-4 learning rate, updating the lm_head only) so that the output layer absorbed the dataset's stylistic texture without overfitting to its content.
Recommended Usage
This model suits applications that need a distinct, refined writing style: cases where the base model's intelligence should be retained but its prose needs a significant stylistic upgrade. Recommended sampler settings are a Temperature of 0.7 to 0.9, Min_P of 0.05, Top_P of 0.95, and a Repetition Penalty of 1.05.
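The recommended sampler settings can be captured as a small configuration sketch. The key names follow the keyword arguments used by Hugging Face `transformers` generation (`temperature`, `min_p`, `top_p`, `repetition_penalty`, `do_sample`); other inference backends may spell them differently. The helper `sampling_kwargs` and the default temperature of 0.8 (the midpoint of the recommended range) are illustrative choices, not part of the model card.

```python
# Recommended values from the model card; temperature defaults to the
# midpoint of the suggested 0.7 to 0.9 range (an illustrative choice).
RECOMMENDED_SAMPLING = {
    "do_sample": True,
    "temperature": 0.8,
    "min_p": 0.05,
    "top_p": 0.95,
    "repetition_penalty": 1.05,
}


def sampling_kwargs(temperature=0.8):
    """Return the recommended settings with a temperature inside 0.7 to 0.9."""
    if not 0.7 <= temperature <= 0.9:
        raise ValueError("temperature outside the recommended 0.7 to 0.9 range")
    return {**RECOMMENDED_SAMPLING, "temperature": temperature}


print(sampling_kwargs(0.7))
```

With a loaded model, the resulting dictionary could be unpacked into a generation call, for example `model.generate(**inputs, **sampling_kwargs(0.7))`, assuming a `transformers` version recent enough to support `min_p`.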