Impish_Bloodmoon_12B_Abliterated Overview
Impish_Bloodmoon_12B_Abliterated is a 12-billion-parameter model developed by SicariusSicariiStuff, based on the Mistral architecture. It is an "abliterated" variant of the original Impish_Bloodmoon_12B, engineered to remove refusal mechanisms and safety guardrails through an orthogonalization technique. It retains most of the base model's capabilities and knowledge: the reported KL divergence from the original is below 0.02, indicating high fidelity to the original's "World Model."
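To make the KL divergence figure concrete, here is a minimal sketch of how a mean per-token KL divergence between two models' next-token distributions could be computed. The source does not say which prompts, token positions, or KL direction were used, so the function name `mean_kl`, the KL(base || modified) direction, and the random stand-in logits are all assumptions for illustration.

```python
import numpy as np

def log_softmax(logits):
    # Numerically stable log-softmax over the vocabulary axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def mean_kl(base_logits, other_logits):
    # KL(base || other) per token position, averaged over positions.
    # Both inputs have shape (num_positions, vocab_size).
    log_p = log_softmax(base_logits)
    log_q = log_softmax(other_logits)
    kl = (np.exp(log_p) * (log_p - log_q)).sum(axis=-1)
    return float(kl.mean())

# Stand-in data (assumption): random logits play the role of the base
# model's outputs, and a small perturbation plays the role of the
# abliterated model's outputs on the same inputs.
rng = np.random.default_rng(0)
base = rng.normal(size=(8, 1000))
perturbed = base + 0.05 * rng.normal(size=base.shape)

kl_value = mean_kl(base, perturbed)
print(f"mean KL divergence: {kl_value:.5f}")
```

A value near zero means the two models assign almost the same probabilities to the next token, which is the sense in which a low KL divergence suggests the modified model behaves like the original outside of refusals.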
Key Capabilities & Features
- Surgical Refusal Removal: Utilizes orthogonalization to inhibit activation along refusal direction vectors in the activation space, effectively eliminating safety guardrails.
- Knowledge Preservation: Designed to preserve most of the original model's knowledge, quirks, and capabilities, ensuring minimal change in its underlying "World Model."
- Low Censorship: Rated 7.2/10 for censorship (where 10 is completely uncensored), which places it in the very low to low censorship range.
- Centrist Alignment: According to the developer, the model's alignment shifted from liberalism to centrism after abliteration.
- Technical Specifications: Built on a Mistral (decoder-only transformer) architecture, uses bf16 precision, and supports a 128K token context length.
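The orthogonalization idea behind refusal removal can be sketched with a single weight matrix. This is an illustration under stated assumptions, not the developer's actual procedure: the helper `orthogonalize` is hypothetical, the matrix is assumed to write into the residual stream as `output = weight @ x`, and a random vector stands in for the refusal direction, since the source does not say how that direction was estimated or which layers were modified.

```python
import numpy as np

def orthogonalize(weight, direction):
    # Project the refusal direction out of the matrix's output space:
    # W' = (I - r r^T) W, where r is the unit-norm direction.
    # weight: (d_model, d_in), direction: (d_model,)
    r = direction / np.linalg.norm(direction)
    return weight - np.outer(r, r @ weight)

# Stand-in data (assumption): small random matrix and direction.
rng = np.random.default_rng(0)
d_model, d_in = 16, 32
W = rng.normal(size=(d_model, d_in))
refusal_dir = rng.normal(size=d_model)

W_abl = orthogonalize(W, refusal_dir)

# For any input, the modified matrix's output has (numerically) no
# component along the refusal direction.
x = rng.normal(size=d_in)
r_hat = refusal_dir / np.linalg.norm(refusal_dir)
leak = float(abs(r_hat @ (W_abl @ x)))
print(f"component along refusal direction: {leak:.2e}")
```

Because only one direction is removed from a high-dimensional space, the rest of the matrix is left unchanged, which is consistent with the claim that most of the model's knowledge is preserved.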
Good For
- General Tasks: Suitable for a wide range of common language model applications.
- Roleplay: Well suited to creative and interactive roleplaying scenarios due to its low censorship and preserved capabilities.
- Research into Alignment: The observed shift in alignment post-abliteration may be of interest for research into model behavior and orthogonalization techniques.