Naahraf27/npo_llama-3.2-3b-instruct_forget10_ep5_lr2e-5_alpha2.0_beta0.1
Naahraf27/npo_llama-3.2-3b-instruct_forget10_ep5_lr2e-5_alpha2.0_beta0.1 is a Llama-3.2-3B-Instruct checkpoint (~3.2 billion parameters) developed by Farhaan Fayaz et al. at University College London. The artifact was produced by applying Negative Preference Optimisation (NPO) to unlearn specific information from the TOFU dataset, targeting the forget10 split. It is the rank-1 checkpoint selected from a benchmark sweep and is released as a research artifact for reproducible study of unlearning effectiveness.
Model Overview
This model, Naahraf27/npo_llama-3.2-3b-instruct_forget10_ep5_lr2e-5_alpha2.0_beta0.1, is a Llama-3.2-3B-Instruct checkpoint (~3.2 billion parameters) developed by Farhaan Fayaz et al. at University College London. It is a research artifact produced by applying Negative Preference Optimisation (NPO) to a TOFU-finetuned base model in order to unlearn the forget10 split of the TOFU dataset (20 fictitious authors, 200 QA pairs).
Key Characteristics
- Unlearning Focus: Designed to investigate the effectiveness of unlearning specific information from LLMs.
- Methodology: Utilizes Negative Preference Optimisation (NPO) on a TOFU-finetuned Llama-3.2-3B-Instruct base model (see the objective sketch after this list).
- Benchmark Selection: Identified as the rank-1 checkpoint based on the official TOFU forget_quality metric within a 54-run sweep.
- Performance: Achieves a TOFU forget quality of 0.468 and a model utility of 0.621, and reduces the novel-recall leak to 6.13% from the TOFU-full model's 7.85%.
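
For readers unfamiliar with NPO, the sketch below illustrates the core forget-set objective as introduced by Zhang et al. (2024). It is a minimal illustration rather than the authors' training code; in particular, the reading of alpha=2.0 as the weight on a retain-set term is inferred from the checkpoint name and is not stated in this card.

```python
import torch
import torch.nn.functional as F

def npo_loss(policy_logprobs: torch.Tensor,
             ref_logprobs: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """NPO objective on the forget set (Zhang et al., 2024).

    policy_logprobs / ref_logprobs: per-example log-probabilities of the
    forget answers under the current model and the frozen reference
    (pre-unlearning) model, respectively.
    """
    # log(1 + (pi_theta / pi_ref)^beta) == softplus(beta * (log pi_theta - log pi_ref))
    return (2.0 / beta) * F.softplus(beta * (policy_logprobs - ref_logprobs)).mean()

# Assumption inferred from the run name (not confirmed by this card):
# alpha = 2.0 weights a standard cross-entropy retain-set term, e.g.
#   total_loss = npo_loss(policy_lp, ref_lp, beta=0.1) + 2.0 * retain_nll
```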
Intended Use
This model is released strictly as a research artifact for reproducibility and auditing purposes, as detailed in the associated paper "Do Unlearned LLMs Really Forget?". It is not intended for production deployment but rather for academic study of LLM unlearning mechanisms and their limitations, particularly concerning persistent knowledge leakage under various probing methods like chain-of-clues prompting.
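
For auditing or reproducibility work, the checkpoint can be loaded and probed with the standard Hugging Face transformers API. The sketch below assumes the repository is publicly available on the Hub and uses a purely illustrative TOFU-style question; it is not the paper's evaluation harness.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Naahraf27/npo_llama-3.2-3b-instruct_forget10_ep5_lr2e-5_alpha2.0_beta0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# Illustrative TOFU-style probe; substitute a question about a forget10-split author.
messages = [{"role": "user", "content": "Which genre does the author Hina Ameen primarily write in?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```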