grimjim/gemma-3-12b-it-MPOAdd-v1 is a 12-billion-parameter instruction-tuned Gemma model derived from google/gemma-3-12b-it. It applies Magnitude-Preserving Orthogonal Addition (MPOAdd) to strengthen refusal behavior against perceived harms, so the model enforces safety concerns more strictly than the base model. MPOAdd geometrically adjusts the model's layers to amplify the directional component of refusal while preserving layer norms, with minimal perplexity loss compared to the baseline.
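The idea of amplifying one direction while keeping norms fixed can be illustrated with a small, dependency-free sketch. This is not the author's implementation: the function names (`mpo_add_sketch`, `fraction_along`), the strength parameter `alpha`, and the choice to preserve the Frobenius norm of each weight matrix are all assumptions made for illustration, and the description above does not say which norm is preserved or how the refusal direction is obtained.

```python
import math

def frobenius_norm(matrix):
    """Square root of the sum of squared entries."""
    return math.sqrt(sum(x * x for row in matrix for x in row))

def mpo_add_sketch(weight, direction, alpha):
    """Hypothetical sketch: boost the part of `weight` that writes along
    `direction`, then rescale to the original Frobenius norm.

    weight    -- list of rows, shape (d_out, d_in)
    direction -- vector of length d_out (assumed refusal direction)
    alpha     -- assumed amplification strength; 0 leaves weight unchanged
    """
    d_norm = math.sqrt(sum(x * x for x in direction))
    r = [x / d_norm for x in direction]          # unit vector in output space
    n_rows, n_cols = len(weight), len(weight[0])
    # r^T W: how strongly each input column contributes along r
    proj = [sum(r[i] * weight[i][j] for i in range(n_rows)) for j in range(n_cols)]
    # W + alpha * r (r^T W): only the component along r grows
    boosted = [[weight[i][j] + alpha * r[i] * proj[j] for j in range(n_cols)]
               for i in range(n_rows)]
    old, new = frobenius_norm(weight), frobenius_norm(boosted)
    scale = old / new if new > 0 else 1.0        # restore the original magnitude
    return [[scale * x for x in row] for row in boosted]

def fraction_along(weight, direction):
    """Share of the matrix's squared norm that lies along `direction`."""
    d_norm = math.sqrt(sum(x * x for x in direction))
    r = [x / d_norm for x in direction]
    proj = [sum(r[i] * weight[i][j] for i in range(len(weight)))
            for j in range(len(weight[0]))]
    return sum(p * p for p in proj) / sum(x * x for row in weight for x in row)

W = [[1.0, 2.0], [3.0, 4.0], [0.5, -1.0]]        # toy weight matrix
refusal_dir = [1.0, 0.0, 1.0]                    # toy direction, not a real one
W_new = mpo_add_sketch(W, refusal_dir, alpha=0.5)
```

In this toy setup the rescaled matrix keeps its overall magnitude while a larger share of it points along the chosen direction, which is the geometric effect the description attributes to MPOAdd.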