DreamFast/Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive-Safetensor-Benchmark
DreamFast/Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive-Safetensor-Benchmark is a 35.1 billion parameter Qwen3.6-A3B model recovered from a Q8_0 quantized GGUF. It uses a hybrid Gated DeltaNet + Gated Attention MoE architecture with 256 experts. The model has been modified to reach a 0% refusal rate, which removes the safety alignment present in the base model. It targets use cases that require uncensored content generation and can be deployed with HuggingFace transformers or vLLM.
Model Overview
This model, DreamFast/Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive-Safetensor-Benchmark, is a 35.1 billion parameter Qwen3.6-A3B variant. It was recovered into HuggingFace safetensors format from a Q8_0 quantized GGUF originally published by HauhauCS. It uses the Qwen3_5MoeForConditionalGeneration architecture, a hybrid Gated DeltaNet + Gated Attention MoE design with 256 experts, 8 of which are active per token.
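To make "256 experts, 8 of which are active per token" concrete, here is a minimal top-k routing sketch in plain Python. It is a generic illustration of sparse MoE routing, not this model's actual router: the function name `route_token`, the softmax-then-top-k order, and the renormalisation of the selected weights are all assumptions made for the example.

```python
import math
import random

NUM_EXPERTS = 256  # total experts, as stated in the model description
TOP_K = 8          # experts active per token, as stated in the model description


def route_token(router_logits, top_k=TOP_K):
    """Return (expert_index, weight) pairs for the top_k highest-scoring experts.

    Hypothetical helper: real routers operate on batched tensors and may
    normalise differently.
    """
    if len(router_logits) < top_k:
        raise ValueError("need at least top_k router logits")

    # Numerically stable softmax over all experts.
    peak = max(router_logits)
    exps = [math.exp(x - peak) for x in router_logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep only the top_k most probable experts.
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]

    # Assumption: renormalise so the selected weights sum to 1.
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]


if __name__ == "__main__":
    rng = random.Random(0)
    logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
    for index, weight in route_token(logits):
        print(f"expert {index:3d}  weight {weight:.3f}")
```

Only the 8 selected experts run for a given token, which is why the active parameter count per token is far smaller than the 35.1 billion total.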
Key Differentiators
- Uncensored Performance: Reaches a 0% refusal rate on harmful prompts, down from the base model's 40%, indicating that abliteration removed the safety alignment.
- Bit-Exact Recovery: All 693 GGUF-derived tensors were verified bit-exact during conversion, ensuring fidelity to the GGUF source.
- Integrated Vision and MTP: Multi-Token Prediction (MTP) and vision encoder tensors, absent in the GGUF, were copied verbatim and bit-exact from the official Qwen3.6-35B-A3B reference model.
- Targeted Modifications: Analysis shows 386 tensors were modified for abliteration, primarily affecting expert and shared expert projections, while router gates and normalization layers remained untouched.
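The bit-exact and modified-tensor counts above come from comparing tensors between checkpoints. The source does not say how the author ran that comparison, so the following is only a sketch of one way to do it: hash each tensor's raw bytes and sort names into identical, modified, and missing. The function names and the toy tensor names are hypothetical, and the sketch assumes both sides have already been read into bytes with the same dtype and layout.

```python
import hashlib
import struct


def digest(raw: bytes) -> str:
    """SHA-256 of a tensor's raw bytes; equal digests mean bit-exact data."""
    return hashlib.sha256(raw).hexdigest()


def compare_checkpoints(reference, candidate):
    """Classify each reference tensor as identical, modified, or missing.

    Both arguments map tensor name -> raw bytes. Hypothetical helper: loading
    the bytes from GGUF or safetensors files is left out of this sketch.
    """
    report = {"identical": [], "modified": [], "missing": []}
    for name, ref_bytes in sorted(reference.items()):
        cand_bytes = candidate.get(name)
        if cand_bytes is None:
            report["missing"].append(name)
        elif digest(cand_bytes) == digest(ref_bytes):
            report["identical"].append(name)
        else:
            report["modified"].append(name)
    return report


if __name__ == "__main__":
    # Toy data with made-up tensor names, packed as little-endian float32.
    base = {
        "blk.0.router.weight": struct.pack("<2f", 0.5, -0.5),
        "blk.0.expert.0.down.weight": struct.pack("<2f", 1.0, 2.0),
    }
    edited = {
        "blk.0.router.weight": struct.pack("<2f", 0.5, -0.5),
        "blk.0.expert.0.down.weight": struct.pack("<2f", 1.0, 2.5),
    }
    for status, names in compare_checkpoints(base, edited).items():
        print(status, len(names), names)
```

Run against the base model, a report like this would separate the tensors changed by abliteration from those left untouched, such as router gates and normalization layers.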
Intended Use Cases
- Unrestricted Content Generation: Ideal for applications requiring responses without refusal or safety filtering.
- Research into Model Alignment: Useful for studying the effects of abliteration and understanding how safety mechanisms are implemented and removed in large language models.
- High-Performance Inference: Supports efficient deployment with HuggingFace transformers and vLLM, including FP8 quantization and tensor parallelism.
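As a rough illustration of the vLLM deployment path, the sketch below assembles a `vllm serve` command with tensor parallelism and FP8 quantization. The helper `build_vllm_command` is hypothetical, and the default GPU count of 2 is an arbitrary example, not a recommendation from the model card. Whether these settings suit a given machine depends on its GPUs and the installed vLLM version.

```python
import shlex

MODEL_ID = "DreamFast/Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive-Safetensor-Benchmark"


def build_vllm_command(model_id=MODEL_ID, tensor_parallel_size=2, fp8=True):
    """Build the argument list for launching a vLLM server.

    Hypothetical helper for illustration; it only constructs the command
    and does not start anything.
    """
    if tensor_parallel_size < 1:
        raise ValueError("tensor_parallel_size must be at least 1")

    command = [
        "vllm", "serve", model_id,
        "--tensor-parallel-size", str(tensor_parallel_size),
    ]
    if fp8:
        # Ask vLLM to quantize the weights to FP8 at load time.
        command += ["--quantization", "fp8"]
    return command


if __name__ == "__main__":
    # Print a shell-ready command line; pass the list to subprocess.run to launch it.
    print(shlex.join(build_vllm_command()))
```

Keeping the command as a list avoids shell-quoting problems if it is later passed to `subprocess.run`.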