byroneverson/gemma-2-27b-it-abliterated
Hugging Face · Text generation
Model size: 27B · Quantization: FP8 · Context length: 32k · Published: Aug 28, 2024 · License: Gemma · Architecture: Transformer · Concurrency cost: 2

byroneverson/gemma-2-27b-it-abliterated is a 27-billion-parameter instruction-tuned language model derived from Google's Gemma 2 architecture. Its defining feature is the 'abliteration' process, a CPU-only method for modifying the model's refusal behavior. The project demonstrates that a model's responses can be altered without accelerator hardware, making it useful for research into model safety and bias mitigation.


Model Abliteration: A CPU-Only Approach

byroneverson/gemma-2-27b-it-abliterated demonstrates a method for modifying a large language model's behavior, specifically its refusal responses, using only CPU processing. This 27-billion-parameter model, based on Gemma-2-27b-it, shows that the 'abliteration' technique can be carried out without specialized accelerator hardware.

Key Capabilities

  • Refusal Direction Vector: The process first obtains a refusal direction vector from a quantized copy of the model, using llama.cpp and ggml-python.
  • Orthogonalization: Each .safetensors file from the original repository is then orthogonalized against this direction and uploaded to a new repository, one file at a time.
  • Accessibility: This method was successfully demonstrated using free Kaggle processing, highlighting its low-resource requirements.
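The two steps above can be sketched numerically. This is a minimal illustration, not the author's notebook: it assumes the common abliteration heuristic in which the refusal direction is the normalized difference of mean hidden-state activations between harmful and harmless prompts, and orthogonalization projects that direction out of weight matrices that write into the residual stream. Activations here are synthetic stand-ins for what llama.cpp would produce.

```python
import numpy as np

def refusal_direction(harmful_acts, harmless_acts):
    """Estimate the refusal direction as the normalized difference of
    mean activations over harmful vs. harmless prompts (a common
    abliteration heuristic; the layer to sample from is an assumption)."""
    diff = harmful_acts.mean(axis=0) - harmless_acts.mean(axis=0)
    return diff / np.linalg.norm(diff)

def orthogonalize(W, v):
    """Remove the component along v from the output of W, so the layer
    can no longer write along the refusal direction: W' = W - v v^T W."""
    return W - np.outer(v, v) @ W

# Synthetic activations: harmful prompts shifted along a hidden axis.
rng = np.random.default_rng(0)
hidden = 64
v_true = rng.normal(size=hidden)
harmless = rng.normal(size=(32, hidden))
harmful = harmless + 3.0 * v_true

v = refusal_direction(harmful, harmless)
W = rng.normal(size=(hidden, hidden))
W_abl = orthogonalize(W, v)
```

After orthogonalization, `v @ W_abl` is (numerically) zero: the modified matrix has no output component along the estimated refusal direction, which is the entire mechanism behind suppressing refusals without retraining.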

Use Cases

  • Research into Model Behavior: Ideal for researchers exploring methods to alter or mitigate undesirable model responses and biases.
  • Low-Resource Model Modification: Provides a proof of concept that large models can be modified without access to high-end GPUs.
  • Educational Tool: The provided Jupyter notebook offers a detailed guide on the abliteration process, serving as a valuable resource for understanding this technique.
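The one-file-at-a-time workflow that keeps this CPU-friendly can also be sketched. The shard dictionary below simulates the contents of a single .safetensors file; the tensor names (`o_proj`, `down_proj`) follow common Gemma-2 checkpoint conventions but are assumptions here, and in the real workflow each shard would be loaded, modified, saved, and uploaded before the next one is touched, so peak memory stays at roughly one shard.

```python
import numpy as np

def abliterate_shard(tensors, v, targets=("o_proj", "down_proj")):
    """Orthogonalize only the matrices that write into the residual
    stream; everything else (norms, embeddings, etc.) passes through.
    Target substrings are assumed checkpoint naming, not confirmed."""
    out = {}
    for name, W in tensors.items():
        if any(t in name for t in targets):
            out[name] = W - np.outer(v, v) @ W  # project out refusal dir
        else:
            out[name] = W
    return out

# Simulated shard standing in for one .safetensors file.
rng = np.random.default_rng(1)
h = 16
v = rng.normal(size=h)
v /= np.linalg.norm(v)
shard = {
    "model.layers.0.self_attn.o_proj.weight": rng.normal(size=(h, h)),
    "model.layers.0.mlp.down_proj.weight": rng.normal(size=(h, h)),
    "model.layers.0.input_layernorm.weight": rng.normal(size=h),
}
new_shard = abliterate_shard(shard, v)
```

Processing shards independently is what makes the Kaggle/CPU-only demonstration feasible: no step ever requires the full 27B weights in memory at once.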