Model Overview
OPTML-Group/SimNPO-MUSE-Books-iclm-7b is a 7-billion-parameter language model released by OPTML-Group. It was produced by applying the SimNPO unlearning algorithm to the muse-bench/MUSE-books_target checkpoint, with the goal of removing ("unlearning") content associated with the MUSE-Books benchmark corpus.
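The checkpoint can be loaded like any other Hugging Face causal language model. The sketch below is illustrative rather than official usage from the model card; the helper name is ours, and loading the 7B weights requires a GPU or substantial RAM.

```python
MODEL_ID = "OPTML-Group/SimNPO-MUSE-Books-iclm-7b"

def load_unlearned_model(device_map="auto"):
    """Load the unlearned checkpoint from the Hugging Face Hub.

    Imports are deferred so this snippet can be read (and the function
    defined) without `transformers` installed; actually calling it will
    download the weights.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map=device_map)
    return tokenizer, model
```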
Unlearning Methodology
The core of this model lies in its use of the SimNPO (Simple Negative Preference Optimization) algorithm, introduced in the paper "Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning" (arXiv:2410.07163). The objective function is designed to remove targeted information while minimizing the impact on the model's general capabilities. Key hyperparameters for the unlearning run are a learning rate of 1e-5, beta = 0.7, lambda = 1.0, and gamma = 0.0.
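Following the formulation in the paper, SimNPO's forget loss applies an NPO-style log-sigmoid to the length-normalized sequence log-probability, and is typically combined with a lambda-weighted negative log-likelihood on the retain set. The sketch below is our own minimal reimplementation for illustration, not the authors' released training code; batching and the exact retain regularizer are simplified.

```python
import math

def simnpo_forget_loss(token_logprobs, beta=0.7, gamma=0.0):
    """Per-example SimNPO forget loss.

    token_logprobs: per-token values of log pi_theta(y_t | x, y_<t) for a
    forget-set response y. SimNPO length-normalizes the sequence
    log-probability by |y| before the log-sigmoid, i.e. the loss is
    -(2/beta) * log sigmoid(-(beta/|y|) * log pi_theta(y|x) - gamma).
    """
    avg_logprob = sum(token_logprobs) / len(token_logprobs)
    z = -beta * avg_logprob - gamma
    sigmoid_z = 1.0 / (1.0 + math.exp(-z))
    return -(2.0 / beta) * math.log(sigmoid_z)

def simnpo_total_loss(forget_logprobs, retain_logprobs,
                      beta=0.7, gamma=0.0, lam=1.0):
    """Forget loss plus a lambda-weighted NLL term on the retain set
    (a common retain regularizer; details may differ from the release)."""
    forget = simnpo_forget_loss(forget_logprobs, beta, gamma)
    retain_nll = -sum(retain_logprobs) / len(retain_logprobs)
    return forget + lam * retain_nll
```

Note the intended behavior: as the model's probability of the forget response drops (average log-probability becomes very negative), the forget loss approaches 0, so already-forgotten examples stop contributing gradient.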
Evaluation and Performance
Evaluation results highlight SimNPO's effectiveness in unlearning. Compared against the original target model, a model retrained from scratch without the forget set, and the standard NPO baseline, SimNPO reaches a VerbMem of 0.00 and a KnowMem of 0.00 on the forget set (Df), indicating successful forgetting of the target content. It also reduces privacy leakage relative to the original model (PrivLeak of -19.82) while preserving a reasonable level of retained knowledge on the retain set (KnowMem Dr of 48.27).
Use Cases
This model is particularly relevant for:
- Research in LLM unlearning: Studying the efficacy and impact of unlearning algorithms.
- Content moderation: Exploring methods to remove undesirable or sensitive information from LLMs.
- Understanding model behavior: Analyzing how unlearning affects different aspects of a language model's knowledge and capabilities.