Naahraf27/npo_llama-3.2-1b-instruct_forget10_ep10_lr5e-5_alpha1.0_beta0.1

Text Generation · Concurrency Cost: 1 · Model Size: 1B · Quant: BF16 · Ctx Length: 32k · Published: Apr 14, 2026 · License: llama3.2 · Architecture: Transformer

Naahraf27/npo_llama-3.2-1b-instruct_forget10_ep10_lr5e-5_alpha1.0_beta0.1 is a 1 billion parameter Llama-3.2-1B-Instruct model developed by Naahraf27 and researchers at University College London. This model has been specifically unlearned using Negative Preference Optimisation (NPO) to forget specific factual information from the TOFU dataset, targeting 20 fictitious authors and 200 QA pairs. It serves as a research artifact for studying machine unlearning and evaluating forgetting mechanisms in large language models, demonstrating how NPO impacts knowledge retention and leakage.

Model Overview

This model, developed by Naahraf27 and researchers at University College London, is a 1 billion parameter Llama-3.2-1B-Instruct checkpoint. It is a research artifact specifically designed to investigate machine unlearning, not for production deployment. The model was created by applying Negative Preference Optimisation (NPO) to a TOFU-finetuned base model.
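For context, the NPO objective (Zhang et al., 2024) penalises the model for assigning the forget-set answers a higher likelihood than a frozen reference model does; it is usually written as:

```latex
\mathcal{L}_{\mathrm{NPO}}
  = \frac{2}{\beta}\,
    \mathbb{E}_{(x,y)\sim\mathcal{D}_{\mathrm{forget}}}
    \left[
      \log\!\left(
        1 + \left(\frac{\pi_\theta(y\mid x)}{\pi_{\mathrm{ref}}(y\mid x)}\right)^{\beta}
      \right)
    \right]
```

Here \(\pi_\theta\) is the model being unlearned, \(\pi_{\mathrm{ref}}\) the reference (pre-unlearning) model, and \(\beta\) the temperature hyperparameter, which is 0.1 in this checkpoint's name.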

Key Characteristics & Unlearning Focus

  • Base Model: open-unlearning/tofu_Llama-3.2-1B-Instruct_full
  • Unlearning Method: Negative Preference Optimisation (NPO)
  • Forget Split: forget10, comprising 20 fictitious authors and 200 QA pairs from the TOFU dataset.
  • Training: Unlearned for 10 epochs with a learning rate of 5e-5, alpha = 1.0, and beta = 0.1.
  • Selection: This is the benchmark-selected rank-1 checkpoint based on the official TOFU forget_quality metric.
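The per-example NPO penalty implied by these hyperparameters can be sketched in plain Python. This is illustrative only, assuming access to sequence log-probabilities under the current and reference models; the function and variable names are hypothetical:

```python
import math

def npo_penalty(logp_theta: float, logp_ref: float, beta: float = 0.1) -> float:
    """Per-example NPO penalty on a forget-set answer.

    logp_theta: log-probability of the answer under the model being unlearned.
    logp_ref:   log-probability under the frozen reference model.
    beta:       the NPO temperature (0.1 in this checkpoint's name).
    """
    # (2 / beta) * log(1 + (pi_theta / pi_ref) ** beta),
    # computed in log space for numerical stability.
    delta = logp_theta - logp_ref
    return (2.0 / beta) * math.log1p(math.exp(beta * delta))

# Equal likelihoods give the baseline value (2 / beta) * ln 2;
# driving the model's likelihood below the reference's shrinks the penalty.
print(npo_penalty(-2.0, -2.0))   # baseline: 20 * ln 2 ≈ 13.86
print(npo_penalty(-50.0, -2.0))  # well-forgotten answer: close to 0
```

Minimising this penalty over the forget split for 10 epochs is what drives the likelihood of the targeted QA pairs down relative to the reference model.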

Audit Results & Implications

  • Achieves a TOFU forget quality of 0.967, indicating effective forgetting of targeted information.
  • Reports a TOFU model utility of 0.548.
  • Exhibits an overall novel-recall leak of 4.67%, slightly higher than that of its TOFU-full counterpart (3.70%). This suggests that NPO primarily alters what the model says rather than what it knows.
  • Further audit details, including per-family breakdowns, are available in the associated research paper and GitHub repository.
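The headline forget_quality number is, per the TOFU benchmark, the p-value of a two-sample Kolmogorov-Smirnov test comparing the unlearned model's truth-ratio distribution on the forget set against that of a retain-only model (a high p-value means the two are statistically indistinguishable, i.e. good forgetting). A minimal sketch of the underlying KS statistic, with hypothetical score lists:

```python
import bisect

def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: max gap between the empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    for v in a + b:  # the ECDF gap can only peak at an observed value
        cdf_a = bisect.bisect_right(a, v) / len(a)
        cdf_b = bisect.bisect_right(b, v) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap

# Identical truth-ratio distributions -> statistic 0 (high p-value);
# fully disjoint distributions -> statistic 1 (p-value near 0).
print(ks_statistic([0.1, 0.5, 0.9], [0.1, 0.5, 0.9]))  # 0.0
print(ks_statistic([0.1, 0.2, 0.3], [0.7, 0.8, 0.9]))  # 1.0
```

Converting the statistic to a p-value (e.g. with `scipy.stats.ks_2samp`) yields the kind of score reported above; the exact evaluation pipeline is defined by the TOFU benchmark tooling.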