byroneverson/gemma-2-27b-it-abliterated

27B
FP8
32768
Aug 28, 2024
License: gemma
Overview

Model Abliteration: A CPU-Only Approach

byroneverson/gemma-2-27b-it-abliterated is a modified version of Gemma-2-27b-it in which refusal responses have been suppressed through a technique called 'abliteration', carried out using only CPU processing. This 27-billion-parameter model shows that abliteration can be performed on a large language model without specialized accelerator hardware.

Key Capabilities

  • Refusal Direction Vector: A refusal direction vector is first obtained from a quantized copy of the model using llama.cpp and ggml-python.
  • Orthogonalization: Each .safetensors file from the original repository is then orthogonalized directly against that direction and uploaded to a new repository, one file at a time.
  • Accessibility: The method was successfully demonstrated on free Kaggle compute, which highlights its low resource requirements.
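
The two steps above can be illustrated with a minimal NumPy sketch. It is not the author's notebook code: the card does not say how the refusal direction is computed, so the difference-of-means recipe below is an assumption borrowed from common abliteration write-ups, and the function names (`refusal_direction`, `orthogonalize`) and toy arrays are hypothetical. In a real run the activations would come from the quantized model, and the projection would be applied to the weight matrices stored in each .safetensors shard.

```python
import numpy as np


def refusal_direction(harmful_acts, harmless_acts):
    """Unit vector along the difference of mean activations.

    ASSUMPTION: difference of means is one common recipe; the model card
    does not state which recipe was actually used.
    """
    diff = harmful_acts.mean(axis=0) - harmless_acts.mean(axis=0)
    return diff / np.linalg.norm(diff)


def orthogonalize(weight, direction):
    """Return (I - r r^T) @ weight for the unit vector r along `direction`.

    `weight` has shape (d_model, d_in) and its output W @ x lives in the
    same space as `direction`, so the result can no longer write along it.
    """
    r = direction / np.linalg.norm(direction)
    return weight - np.outer(r, r @ weight)


# Toy stand-ins for hidden states collected on two prompt sets.
rng = np.random.default_rng(0)
d_model, d_in = 8, 5
harmful = rng.normal(size=(16, d_model)) + 2.0 * np.eye(d_model)[0]
harmless = rng.normal(size=(16, d_model))

r = refusal_direction(harmful, harmless)

# Toy stand-in for one weight matrix loaded from a shard.
W = rng.normal(size=(d_model, d_in))
W_abl = orthogonalize(W, r)

# The output of the modified matrix has no component along r
# (zero up to floating-point error).
x = rng.normal(size=d_in)
print(abs(r @ (W_abl @ x)))
```

Because the projection is applied independently to each matrix, only one shard needs to be held in memory at a time, which is consistent with the file-by-file workflow described above.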

Use Cases

  • Research into Model Behavior: Ideal for researchers exploring methods to alter or mitigate undesirable model responses and biases.
  • Low-Resource Model Modification: Provides a proof-of-concept for modifying large models without access to high-end GPUs.
  • Educational Tool: The provided Jupyter notebook is a detailed guide to the abliteration process and a useful reference for understanding the technique.