The open-unlearning/neg_tofu_Llama-3.2-1B-Instruct_retain90_forget10_pert_lr1e-05_wd0.01_epoch10 model is a 1-billion-parameter instruction-tuned language model based on the Llama-3.2 architecture. Built for open unlearning research on the TOFU benchmark, it is trained to forget a targeted 10% split of the data while preserving performance on the remaining 90% retain split. Its distinguishing feature is its perturbation-based unlearning recipe, which makes it well suited to studying controlled knowledge removal in LLMs and to exploring targeted forgetting and knowledge retention in smaller, instruction-following models.
Model Overview
This model, neg_tofu_Llama-3.2-1B-Instruct_retain90_forget10_pert_lr1e-05_wd0.01_epoch10, is a 1-billion-parameter instruction-tuned language model built upon the Llama-3.2 architecture. It is a research-oriented model developed for the study of open unlearning, i.e., selectively removing specific information from a trained model while preserving its general knowledge.
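A minimal usage sketch with the Hugging Face `transformers` library is shown below. It assumes the checkpoint is published under the `open-unlearning` organization on the Hub and that `transformers` and `torch` are installed; the probe question is a hypothetical placeholder, not taken from the TOFU forget split.

```python
# Minimal loading/inference sketch (assumes Hub availability of the checkpoint).
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = ("open-unlearning/"
            "neg_tofu_Llama-3.2-1B-Instruct_retain90_forget10_pert_lr1e-05_wd0.01_epoch10")

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")

# Query the model through the Llama-3.2 chat template; a hypothetical probe
# question is used here for illustration.
messages = [{"role": "user", "content": "What can you tell me about this author?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
output_ids = model.generate(input_ids, max_new_tokens=64, do_sample=False)
reply = tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(reply)
```

Comparing such probes on forget-set versus retain-set questions is the typical way to inspect whether the targeted information has been removed.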
Key Characteristics
- Parameter Count: 1 billion parameters, offering a compact yet capable foundation for research.
- Context Length: Supports a substantial context window of 32,768 tokens.
- Unlearning Focus: Trained on the TOFU retain90/forget10 configuration: the forget set covers 10% of the targeted data, while performance on the remaining 90% (the retain set) is preserved.
- Methodology: Utilizes a perturbation-based unlearning approach, with a learning rate of 1e-05 and weight decay of 0.01 over 10 epochs.
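The model card does not spell out the exact objective behind the `pert` variant. As an illustration only, a common formulation consistent with retain/forget splits is a gradient-difference update: descend on the retain-set loss while ascending on the forget-set loss. The toy NumPy sketch below applies that update to a hypothetical logistic-regression model, reusing the hyperparameters encoded in the checkpoint name (lr 1e-05, weight decay 0.01, 10 epochs); the data and model are stand-ins, not the actual recipe.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data standing in for retain (90%) and forget (10%) examples.
X_retain, y_retain = rng.normal(size=(90, 4)), rng.integers(0, 2, size=90)
X_forget, y_forget = rng.normal(size=(10, 4)), rng.integers(0, 2, size=10)

def bce_grad(w, X, y):
    """Gradient of mean binary cross-entropy for a logistic-regression model."""
    p = 1.0 / (1.0 + np.exp(-X @ w))
    return X.T @ (p - y) / len(y)

lr, wd, epochs = 1e-05, 0.01, 10  # hyperparameters from the checkpoint name
w = rng.normal(size=4)

for _ in range(epochs):
    # Gradient difference: minimize the retain loss, maximize the forget loss,
    # with decoupled weight decay on the parameters.
    grad = bce_grad(w, X_retain, y_retain) - bce_grad(w, X_forget, y_forget)
    w -= lr * (grad + wd * w)
```

The small learning rate keeps the update conservative, which matches the stated goal of forgetting a narrow slice of data without degrading general capability.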
Intended Use Cases
- Research in Machine Unlearning: Ideal for academics and researchers exploring methods for controlled knowledge removal and retention in large language models.
- Prototyping Unlearning Techniques: Provides a practical model for developing and testing new unlearning algorithms.
- Understanding Model Behavior: Useful for analyzing how models adapt and change their internal representations when specific data is 'forgotten'.