puwaer/Qwen3-4B-Thinking-2507-GRPO-Uncensored
Text Generation | Concurrency Cost: 1 | Model Size: 4B | Quant: BF16 | Ctx Length: 32k | Published: Jan 9, 2026 | License: apache-2.0 | Architecture: Transformer | 0.0K | Open Weights | Warm

puwaer/Qwen3-4B-Thinking-2507-GRPO-Uncensored is an uncensored language model based on Qwen3-4B-Thinking-2507. It was fine-tuned by puwaer in three stages: supervised fine-tuning (SFT), SimPO, and GRPO. The model is optimized to bypass safety boundaries and reports a refusal rate below 4-5% on safety benchmarks. It is designed to generate harmful or unrestricted responses, and the fine-tuning also attempts to recover the conversational intelligence lost during the uncensoring procedures.
