DavidAU/LFM2.5-1.2B-Thinking-Claude-4.6-Opus-Heretic-Uncensored-DISTILL

Text Generation · Concurrency Cost: 1 · Model Size: 1.2B · Quant: BF16 · Ctx Length: 32k · Published: Feb 16, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights

DavidAU/LFM2.5-1.2B-Thinking-Claude-4.6-Opus-Heretic-Uncensored-DISTILL is a 1.2 billion parameter LFM2.5 model fine-tuned for deep reasoning on distilled reasoning datasets. Its reasoning is compact yet detailed, and it operates with a 32768 token context length. It is also a "Heretic" model: it was de-censored before fine-tuning, which is intended to keep its behavior consistent. Its main strengths are enhanced reasoning and uncensored output, which suit applications that need direct, unfiltered responses.


Model Overview

DavidAU/LFM2.5-1.2B-Thinking-Claude-4.6-Opus-Heretic-Uncensored-DISTILL is a 1.2 billion parameter LFM2.5 model fine-tuned on distilled reasoning datasets. The fine-tuning was done with Unsloth on local Linux hardware at 16-bit precision, and it is described as completely replacing and enhancing the model's thinking and reasoning. The resulting reasoning is described as compact, detailed, and direct.

Key Features & Capabilities

  • Enhanced Reasoning: The model's core reasoning process has been entirely re-tuned, impacting general operation, output generation, and benchmark performance.
  • Uncensored ("Heretic"): This model is fully uncensored. It went through the "Heretic" de-censoring process before fine-tuning, so the later tuning helps correct issues that censorship removal typically leaves behind. It aims to give direct responses without refusals.
  • Extended Context Window: Supports a 32768 token context length.
  • Temperature Stability: Reasoning is stable across a wide temperature range of 0.1 to 2.5.
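
As a rough illustration of the context-window figure above, the sketch below estimates how many tokens remain for generation once a prompt has been counted. The function name `max_new_tokens` and the `reserve` parameter are hypothetical helpers, not part of any library, and the sketch assumes that the prompt, the thinking trace, and the final answer all share the same 32768 token window.

```python
# Context length stated in this model card.
CTX_LENGTH = 32768


def max_new_tokens(prompt_tokens: int, reserve: int = 0,
                   ctx_length: int = CTX_LENGTH) -> int:
    """Return how many tokens can still be generated after the prompt.

    prompt_tokens: number of tokens already used by the prompt.
    reserve:       tokens to hold back (hypothetical safety margin).
    """
    if prompt_tokens < 0 or reserve < 0:
        raise ValueError("token counts must be non-negative")
    # Never report a negative budget when the prompt already overflows.
    return max(0, ctx_length - prompt_tokens - reserve)


if __name__ == "__main__":
    print(max_new_tokens(30000))        # tokens left for thinking + answer
    print(max_new_tokens(30000, 768))   # same, with a margin held back
```

Because a thinking model spends part of this budget on its reasoning trace, it may be worth leaving more headroom than a non-reasoning model of the same context length would need.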

Performance & Usage Notes

Benchmarks show improvements over the base Heretic LFM2.5-1.2B-Thinking-q8 model on arc_easy, boolq, hellaswag, openbookqa, piqa, and winogrande.

The creator suggests the following settings:

  • Quantization: q5, q6, q8, or 16-bit precision, or Imatrix IQ3_M.
  • Repetition penalty: 1.05 to 1.1.
  • Temperature: lowering it to 0.3-0.7 can mitigate looping during thinking.
  • Smoothing: for chat and roleplay, or smoother operation, set "Smoothing_factor" to 1.5 in interfaces such as KoboldCpp, oobabooga/text-generation-webui, or SillyTavern.

Although the model is uncensored, it may need explicit directives (for example, naming the specific slang or terms to use) before it generates highly graphic or explicit content.
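
The suggested sampler values can be collected into a single settings dictionary, as in the minimal sketch below. The function names and the dictionary keys (`temperature`, `repetition_penalty`, `smoothing_factor`) are assumptions chosen for illustration; each front end or inference library uses its own parameter names, so the keys should be mapped to whatever the chosen tool expects.

```python
from typing import List, Optional

# Ranges taken from this model card.
STABLE_TEMP = (0.1, 2.5)      # range over which reasoning is reported stable
ANTI_LOOP_TEMP = (0.3, 0.7)   # range suggested to mitigate looping
REP_PENALTY = (1.05, 1.1)     # recommended repetition penalty


def build_sampling_settings(temperature: float = 0.6,
                            repetition_penalty: float = 1.1,
                            smoothing_factor: Optional[float] = None) -> dict:
    """Build a settings dict (hypothetical key names) from the card's values."""
    if not STABLE_TEMP[0] <= temperature <= STABLE_TEMP[1]:
        raise ValueError("temperature outside the reported stable range")
    settings = {"temperature": temperature,
                "repetition_penalty": repetition_penalty}
    # Smoothing is optional; the card suggests 1.5 for chat and roleplay.
    if smoothing_factor is not None:
        settings["smoothing_factor"] = smoothing_factor
    return settings


def advisory_notes(settings: dict) -> List[str]:
    """Return notes for values outside the card's narrower suggestions."""
    notes = []
    t = settings["temperature"]
    if not ANTI_LOOP_TEMP[0] <= t <= ANTI_LOOP_TEMP[1]:
        notes.append("temperature outside 0.3-0.7; thinking may loop")
    rp = settings["repetition_penalty"]
    if not REP_PENALTY[0] <= rp <= REP_PENALTY[1]:
        notes.append("repetition penalty outside 1.05-1.1")
    return notes


if __name__ == "__main__":
    chat = build_sampling_settings(smoothing_factor=1.5)
    print(chat, advisory_notes(chat))
```

Keeping the hard limit (the stable temperature range) separate from the advisory ranges is a design choice of this sketch: values outside the narrower suggestions are flagged rather than rejected, since the card presents them as recommendations.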