DavidAU/Qwen3.6-12B-IQ-Ultra-Heretic-Uncensored-Thinking-V2-Hightop
DavidAU/Qwen3.6-12B-IQ-Ultra-Heretic-Uncensored-Thinking-V2-Hightop is a 12 billion parameter, 24-layer model derived from the 27 billion parameter Qwen 3.6 model. The 27B model was first uncensored, then 'shrunk' to 12 billion parameters and 24 layers via a modified Mergekit. The result was fine-tuned in two stages on six datasets using Unsloth, with specific tuning to unify its new layer structure. It has a 256k context length and is tuned for general and specific use cases, including creative tasks. The image/video training systems from the original 27B model remain intact.
Model Overview
DavidAU/Qwen3.6-12B-IQ-Ultra-Heretic-Uncensored-Thinking-V2-Hightop started from the 27 billion parameter Qwen 3.6 base model. That model was first uncensored using 'Heretic' by P-E-W, then reduced from 64 layers to 24 via a modified Mergekit process, yielding a 12 billion parameter model. The 12B model was then fine-tuned in two stages on six datasets using Unsloth, with a focus on unifying its new layer structure.
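To put the reduction in perspective, the short sketch below computes what fraction of the layers and of the nominal parameter count survive. It uses only the figures quoted above (64 to 24 layers, 27B to 12B parameters); the helper name `fraction_kept` is made up for illustration, and the parameter counts are rounded marketing figures, so the result is approximate.

```python
# Figures quoted in the model description (parameter counts are nominal).
ORIGINAL_LAYERS = 64
REDUCED_LAYERS = 24
ORIGINAL_PARAMS_B = 27.0
REDUCED_PARAMS_B = 12.0


def fraction_kept(reduced: float, original: float) -> float:
    """Return the share of the original quantity that remains."""
    return reduced / original


layer_fraction = fraction_kept(REDUCED_LAYERS, ORIGINAL_LAYERS)
param_fraction = fraction_kept(REDUCED_PARAMS_B, ORIGINAL_PARAMS_B)

print(f"layers kept: {layer_fraction:.1%}")
print(f"params kept: {param_fraction:.1%}")
```

The share of parameters kept comes out somewhat higher than the share of layers kept. One plausible reason, not confirmed by the model card, is that some components (for example embeddings) do not shrink when layers are removed.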
Key Capabilities & Features
- Parameter Count: 12 billion parameters (derived from a 27B model).
- Context Length: Supports a 256k-token context.
- Uncensored Nature: Retains its uncensored characteristics from the initial 'Heretic' modification.
- Layer Structure: Reduced to 24 layers (from 64), contributing to faster inference (e.g., 150 t/s on a Q4_K_S quant with a 5090 GPU).
- Multimodal Potential: Image/video training systems from the original 27B model are intact.
- Fine-tuning: Tuned for general and specific use cases, including creative tasks, with some math, code, and reasoning incorporated.
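As a rough illustration of what the quoted 150 t/s means in practice, the sketch below converts a generation rate into wall-clock time for a few response lengths. The rate is the single figure reported above (Q4_K_S on a 5090); the function name `generation_seconds` and the response lengths are hypothetical, and prompt-processing time is ignored, so real latency will differ with hardware, quantization, and context size.

```python
REPORTED_TOKENS_PER_SECOND = 150.0  # figure quoted in the feature list


def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Estimate generation time, ignoring prompt processing and overhead."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


# Hypothetical response lengths, chosen only for illustration.
for num_tokens in (300, 1500, 6000):
    seconds = generation_seconds(num_tokens, REPORTED_TOKENS_PER_SECOND)
    print(f"{num_tokens} tokens: about {seconds:.1f} s")
```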
Important Considerations
- Tuning Requirements: The model may require additional tuning (25-50k samples minimum) for specific or general use cases to reach its 'full power'.
- Knowledge Gaps: Due to the unique compression method, some knowledge or skills from the original 27B model might be missing.
- Performance Relative to Qwen 3.5 9B: Until this model is fully tuned, the Qwen 3.5 9B model may outperform it.
Good For
- Developers looking for an uncensored Qwen-based model with a reduced parameter count for faster inference.
- Experimentation with fine-tuning on local hardware (12-16 GB VRAM) or Google Colab.
- Creative generation tasks and general use cases where its unique tuning might be beneficial.
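For the 12-16 GB VRAM range mentioned above, a back-of-the-envelope check of weight storage at different precisions can help with planning. This is a minimal sketch under stated assumptions: it counts weights only for a nominal 12 billion parameters, uses round bit widths rather than any specific quantization format, and ignores activations, KV cache, optimizer state, and adapter weights, all of which add to real usage. The function name `weights_gb` is hypothetical.

```python
NOMINAL_PARAMS = 12e9          # nominal parameter count from the model card
VRAM_BUDGETS_GB = (12, 16)     # range mentioned in the list above


def weights_gb(bits_per_weight: float, params: float = NOMINAL_PARAMS) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes), weights only."""
    return params * bits_per_weight / 8 / 1e9


for label, bits in (("16-bit", 16), ("8-bit", 8), ("4-bit", 4)):
    size = weights_gb(bits)
    # Strict comparison: weights that exactly fill the card leave no headroom.
    fits = [budget for budget in VRAM_BUDGETS_GB if size < budget]
    print(f"{label}: ~{size:.1f} GB of weights, leaves room on: {fits or 'neither'}")
```

Under these assumptions, 16-bit weights alone exceed both budgets, which suggests that working in this VRAM range would involve some form of reduced-precision loading. The model card does not state which setup was used.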