Greytechai/LFM2.5-1.2B-Thinking-Kimi-V2-Heretic-Uncensored-DISTILL

Text Generation · Concurrency Cost: 1 · Model Size: 1.2B · Quant: BF16 · Ctx Length: 32k · Published: Mar 18, 2026 · License: apache-2.0 · Architecture: Transformer · 0.0K · Open Weights · Cold

Greytechai/LFM2.5-1.2B-Thinking-Kimi-V2-Heretic-Uncensored-DISTILL is a 1.2-billion-parameter LFM2.5-based model, fine-tuned by Greytechai using Unsloth on distilled reasoning datasets. It has a 32,768-token context length and is designed for deep, detailed reasoning; its thinking/reasoning behavior was completely replaced during fine-tuning. The model is also a 'Heretic' uncensored variant: it was de-censored before tuning so that its output is uninhibited.

Model Overview

Greytechai/LFM2.5-1.2B-Thinking-Kimi-V2-Heretic-Uncensored-DISTILL is a 1.2-billion-parameter language model built on the LFM2.5 architecture. It was fine-tuned with Unsloth on distilled reasoning datasets, which completely replaced its thinking and reasoning behavior. The model is designed to produce compact yet detailed reasoning that addresses the prompt directly, without excessive verbosity.

Key Capabilities & Features

  • Enhanced Reasoning: The model's core reasoning mechanism has been entirely replaced and optimized for deep, detailed thought processes.
  • Uncensored Output: As a 'Heretic' model, it was de-censored before tuning, ensuring it does not refuse requests and generates content as directed, including potentially sensitive or explicit material.
  • Stable Reasoning: Reasoning is reported to remain stable across temperatures from 0.1 to 2.5.
  • Extended Context: Supports a context length of 32,768 tokens.
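
The two numeric limits listed above (a 32768-token context window and a temperature range of 0.1 to 2.5) can be turned into a small pre-flight check before sending a request. The following is a minimal sketch only: the function names and the idea of reserving room for the reply are illustrative assumptions, not part of the model card, and token counts would have to come from whatever tokenizer your runtime uses.

```python
# Values taken from the model card.
CONTEXT_LENGTH = 32768            # maximum context length in tokens
STABLE_TEMPERATURE = (0.1, 2.5)   # range in which reasoning is reported stable


def remaining_tokens(prompt_tokens: int, max_new_tokens: int) -> int:
    """Tokens left in the context window after the prompt and planned reply.

    A negative result means the request would not fit.
    (Hypothetical helper, for illustration only.)
    """
    return CONTEXT_LENGTH - (prompt_tokens + max_new_tokens)


def in_stable_range(temperature: float) -> bool:
    """True if the temperature lies inside the range the card reports as stable."""
    low, high = STABLE_TEMPERATURE
    return low <= temperature <= high
```

Because a thinking model spends part of its output on the reasoning trace, it is usually worth leaving a generous `max_new_tokens` budget when checking whether a long prompt fits.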

Optimal Usage & Settings

For best performance, use q5, q6, q8, or 16-bit precision, or the Imatrix IQ3_M quantization. A repetition penalty of 1.05 to 1.1 is suggested. If the model loops during thinking, lower the temperature to 0.3-0.7. For chat and roleplay, setting 'Smoothing_factor' (or 'Smoothing') to 1.5 in interfaces such as KoboldCpp, oobabooga, or SillyTavern is highly recommended for smoother operation.
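
The settings above can be collected into a single preset. This is a hedged sketch rather than a configuration for any specific frontend: the dictionary keys (`temperature`, `repetition_penalty`, `smoothing_factor`) are assumed names, the default temperature of 1.0 is an assumption because the card does not state one, and 0.5 is simply one value inside the suggested 0.3-0.7 range.

```python
def sampling_settings(looping: bool = False, chat: bool = False) -> dict:
    """Build a sampler preset from the card's recommendations.

    looping: set True if the model loops during thinking.
    chat:    set True for chat or roleplay use.
    (Hypothetical helper; key names may differ in your frontend.)
    """
    settings = {
        "temperature": 1.0,          # assumption: the card gives no default
        "repetition_penalty": 1.05,  # card suggests 1.05 to 1.1
    }
    if looping:
        # The card suggests lowering temperature to 0.3-0.7 when looping occurs.
        settings["temperature"] = 0.5
    if chat:
        # The card recommends a smoothing factor of 1.5 for chat and roleplay.
        settings["smoothing_factor"] = 1.5
    return settings
```

For example, `sampling_settings(looping=True, chat=True)` yields a lowered temperature together with the smoothing factor; the values would then need to be mapped onto the corresponding fields in KoboldCpp, oobabooga, or SillyTavern.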