Model Overview
ClaudioSavelli/FAME_GA_llama32-3b-instruct-qa is a 3-billion-parameter instruction-tuned model built on meta-llama/Llama-3.2-3B-Instruct. Its distinguishing feature is an "unlearning" process applied with the Gradient Ascent (GA) method, conducted within the FAME (Forgetting by Adversarial Model Editing) setting.
Key Characteristics
- Base Model: Derived from meta-llama/Llama-3.2-3B-Instruct, inheriting its foundational capabilities.
- Unlearning Technique: Utilizes the Gradient Ascent method for targeted model unlearning.
- FAME Setting: The unlearning process is conducted within the FAME framework, indicating a focus on adversarial model editing for forgetting specific information.
- Context Length: Supports a context length of 32768 tokens.
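Gradient-ascent unlearning inverts the usual fine-tuning update: instead of minimizing the loss on the data to be forgotten, parameters are moved in the direction that increases it. A generic formulation (the exact objective, learning rate, and schedule used for this model are in the associated paper) is:

```latex
\theta_{t+1} = \theta_t + \eta \, \nabla_\theta \, \mathcal{L}\!\left(\theta_t;\; \mathcal{D}_{\text{forget}}\right)
```

where $\mathcal{L}$ is the standard language-modeling loss, $\mathcal{D}_{\text{forget}}$ is the forget set, and $\eta$ is the step size.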
Potential Use Cases
This model is particularly relevant for research and applications involving:
- Investigating Model Unlearning: Studying the effects and efficacy of Gradient Ascent for unlearning in large language models.
- Privacy-Preserving AI: Exploring methods to remove specific data or knowledge from a model post-training.
- Controlled Model Behavior: Developing models where certain undesirable information or biases need to be systematically removed.
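To make the mechanism concrete, here is a minimal, self-contained toy sketch of gradient-ascent unlearning. It is a hypothetical illustration, not the FAME training code: a logistic-regression "model" is first trained on all data with gradient descent, then gradient *ascent* is applied on a designated forget subset so the model's loss on that subset rises. The data, model, and hyperparameters are all invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(w, X, y):
    # Binary cross-entropy, the quantity GA deliberately increases on the forget set.
    p = sigmoid(X @ w)
    eps = 1e-9
    return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

def grad(w, X, y):
    # Gradient of the cross-entropy loss w.r.t. the weights.
    p = sigmoid(X @ w)
    return X.T @ (p - y) / len(y)

# Synthetic dataset; the first 40 rows play the role of the forget set.
X = rng.normal(size=(200, 5))
w_true = rng.normal(size=5)
y = (sigmoid(X @ w_true) > 0.5).astype(float)
X_forget, y_forget = X[:40], y[:40]

# 1) Standard training: gradient DESCENT on all data.
w = np.zeros(5)
for _ in range(500):
    w -= 0.5 * grad(w, X, y)
loss_before = loss(w, X_forget, y_forget)

# 2) Unlearning: gradient ASCENT on the forget set only.
for _ in range(50):
    w += 0.1 * grad(w, X_forget, y_forget)
loss_after = loss(w, X_forget, y_forget)

print(f"forget-set loss before: {loss_before:.4f}, after: {loss_after:.4f}")
```

Running the sketch shows the forget-set loss increasing after the ascent phase, which is the core effect GA-based unlearning relies on; in the LLM setting the same sign flip is applied to the language-modeling loss on the forget corpus, typically with safeguards (few steps, small learning rate) to limit collateral damage to retained capabilities.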
Further technical details on the unlearning methodology can be found in the associated research paper.