Model Overview
electroglyph/Qwen3-4B-Instruct-2507-uncensored-unslop-v2 is a 4-billion-parameter instruction-tuned model built on the Qwen3 architecture. It is a GRPO (Group Relative Policy Optimization) finetune of the Qwen3-4B-Instruct-2507-uncensored base model, with the primary goal of reducing 'slop', i.e. verbose or repetitive output.
Key Characteristics
- Slop Mitigation: Uses GRPO finetuning to reduce verbosity and improve conciseness in generated text, addressing an issue present in its uncensored predecessor.
- Distinct Style: Produces output that differs from regular Qwen3 4B 2507 models, with a Gemma-influenced writing style owing to the origin of the uncensoring dataset.
- Uncensored Base: Retains the uncensored nature of its base model while aiming for more refined output.
- GGUF Availability: A UD-Q4_K_XL GGUF build is provided, with quantization settings derived from Unsloth's quant utility.
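As a minimal sketch of running the GGUF build locally: Qwen3 instruct models use a ChatML-style chat template, so a prompt can be assembled as below and fed to a llama.cpp-based runtime. The GGUF filename and the llama-cpp-python call shown in comments are assumptions for illustration, not taken from this card; check the repo for the actual file name.

```python
# Minimal sketch: build a ChatML-formatted prompt for a Qwen3-style
# instruct model. Pure Python; no model download required.

def build_chatml_prompt(user_msg: str,
                        system_msg: str = "You are a helpful assistant.") -> str:
    """Assemble a ChatML prompt as used by Qwen3 instruct models."""
    return (
        f"<|im_start|>system\n{system_msg}<|im_end|>\n"
        f"<|im_start|>user\n{user_msg}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

# Hypothetical llama-cpp-python usage (requires `pip install llama-cpp-python`
# and the UD-Q4_K_XL GGUF downloaded locally; the filename is an assumption):
#
# from llama_cpp import Llama
# llm = Llama(model_path="Qwen3-4B-Instruct-2507-uncensored-unslop-v2-UD-Q4_K_XL.gguf")
# out = llm(build_chatml_prompt("Summarize GRPO in one sentence."), max_tokens=128)
# print(out["choices"][0]["text"])

print(build_chatml_prompt("Hello!"))
```

The generation call stays commented out so the snippet runs without the weights; in practice the prompt string is passed directly to the loaded GGUF model.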
Use Cases
This model is particularly well-suited for applications where:
- Concise and direct responses are preferred over verbose output.
- An uncensored model is required, but with improved output quality and reduced 'slop'.
- A Qwen3-based model with a distinct, potentially Gemma-influenced writing style is desired, without the accompanying verbosity.