Overview
This model, richardyoung/Qwen2.5-14B-Instruct-1M-heretic, is a 14.8-billion-parameter instruction-tuned causal language model based on the Qwen2.5 series. Its primary distinction is the removal of safety guardrails and refusal behaviors using the Heretic v1.0.1 tool, which reduces refusals from 87/100 on the original model to 3/100.
Key Capabilities
- Ultra-Long Context: Supports a context window of up to 1,010,000 tokens and generation of up to 8,192 tokens per response, inherited from the Qwen2.5-1M base, which is designed to outperform shorter-context Qwen2.5 variants on long-context tasks.
- Decensored Output: Modified to suppress refusal directions, allowing generation of content that the original model would refuse.
- Qwen2.5 Architecture: Built on a transformer architecture featuring RoPE, SwiGLU, RMSNorm, and Attention QKV bias.
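The context and generation limits above imply a simple token budget: a prompt plus the requested generation must fit within the window. A minimal sketch of that check, assuming the model card's stated limits and that the 1,010,000-token window covers both prompt and generated tokens (an assumption, not confirmed by the card):

```python
# Limits as stated on the model card (assumed to share one window).
MAX_CONTEXT = 1_010_000   # total context window, in tokens
MAX_NEW_TOKENS = 8_192    # maximum generation length per response

def fits(prompt_tokens: int, new_tokens: int = MAX_NEW_TOKENS) -> bool:
    """Return True if prompt plus requested generation stays within the window."""
    if not 0 < new_tokens <= MAX_NEW_TOKENS:
        raise ValueError(f"new_tokens must be in 1..{MAX_NEW_TOKENS}")
    return prompt_tokens + new_tokens <= MAX_CONTEXT
```

For example, a 1,000,000-token prompt still leaves room for a full 8,192-token response, while a 1,005,000-token prompt does not.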
Intended Use Cases
- Research and Education: Ideal for studying model behavior, limitations, and the effects of decensoring.
- Creative Writing & Roleplay: Suitable for generating diverse fiction without built-in content restrictions; intended for consenting adults.
- Red-Teaming & Safety Research: Useful for probing model vulnerabilities and understanding potential risks.
Important Considerations
This model is explicitly not intended for generating harmful, illegal, or unethical content. Users are solely responsible for its outputs and must comply with all applicable laws and ethical guidelines. The creator accepts no liability for misuse.