ClaudioSavelli/FAME_GA_llama32-1b-5-instruct-qa
ClaudioSavelli/FAME_GA_llama32-1b-5-instruct-qa is a 1-billion-parameter language model developed by ClaudioSavelli and based on the Llama 3.2 architecture. The model has been unlearned with the Gradient Ascent method in the FAME setting, and is intended for research into model unlearning techniques and their impact on instruction-following and question-answering capabilities.
Model Overview
ClaudioSavelli/FAME_GA_llama32-1b-5-instruct-qa is a 1-billion-parameter model derived from the meta-llama/Llama-3.2-1B-Instruct base. Its primary distinction is that it was produced with a Gradient Ascent (GA) unlearning method within the FAME (Forgetting A Model Effectively) setting.
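As a quick orientation, the model should load through the standard Hugging Face transformers interface like any other Llama 3.2 checkpoint. The sketch below is illustrative: the prompt, dtype, and generation settings are assumptions, not part of this model card.

```python
# Minimal usage sketch; assumes the repo is a standard
# transformers-compatible Llama 3.2 checkpoint on the Hub.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ClaudioSavelli/FAME_GA_llama32-1b-5-instruct-qa"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # illustrative; use float32 on CPU if needed
    device_map="auto",
)

# Instruction-tuned Llama models expect the chat template.
messages = [{"role": "user", "content": "What is machine unlearning?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```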
Key Characteristics
- Unlearning Focus: This model is the result of applying an unlearning technique, specifically Gradient Ascent, to a pre-existing instruction-tuned Llama 3.2 model (a minimal sketch of the method follows this list).
- Research-Oriented: It serves as a valuable resource for researchers exploring methods of model unlearning, particularly in understanding how specific information or behaviors can be removed from a trained LLM.
- Base Architecture: Built on the Llama-3.2-1B-Instruct architecture, it retains the foundational capabilities of its parent model except where the unlearning process has removed them.
- Context Length: The model supports a context length of 32,768 tokens, allowing it to process relatively long inputs.
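The exact FAME training recipe is not reproduced here, but the core idea of Gradient Ascent unlearning is to maximize, rather than minimize, the language-modeling loss on the data to be forgotten. The hypothetical PyTorch sketch below illustrates that loop; the learning rate, epoch count, and `forget_texts` placeholder are assumptions, not the actual FAME configuration.

```python
# Hypothetical Gradient Ascent unlearning loop: negate the language-modeling
# loss on the forget set so optimization ascends it. All hyperparameters and
# data here are placeholders, not the FAME authors' recipe.
import torch
from torch.optim import AdamW
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Llama-3.2-1B-Instruct"  # gated repo; requires license acceptance
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)
model.train()

optimizer = AdamW(model.parameters(), lr=1e-5)
forget_texts = ["<examples the model should forget>"]  # placeholder forget set

for epoch in range(3):
    for text in forget_texts:
        batch = tokenizer(text, return_tensors="pt")
        outputs = model(**batch, labels=batch["input_ids"])
        loss = -outputs.loss  # gradient *ascent*: maximize loss on the forget set
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```

Note that unconstrained gradient ascent is known to degrade general capabilities quickly, which is part of what makes its trade-offs worth studying with models like this one.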
Use Cases
- Investigating Unlearning: Ideal for studies on the effectiveness and impact of Gradient Ascent as an unlearning mechanism.
- Model Auditing: Can be used to analyze how unlearning affects model performance, bias, or retention of specific knowledge (see the comparison sketch after this list).
- Security Research: Relevant for exploring methods to remove unwanted information or behaviors from deployed models.
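As a starting point for such studies, a simple audit compares the unlearned model's answers with those of the original base model on the same prompts. The sketch below is hypothetical; the probe question and decoding settings are illustrative only.

```python
# Hypothetical audit: run the same prompt through the base and unlearned
# models and compare the answers. The probe question is illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

prompt = [{"role": "user", "content": "Summarize the plot of a well-known novel."}]

for model_id in (
    "meta-llama/Llama-3.2-1B-Instruct",                 # original base (gated repo)
    "ClaudioSavelli/FAME_GA_llama32-1b-5-instruct-qa",  # unlearned model
):
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tokenizer.apply_chat_template(
        prompt, add_generation_prompt=True, return_tensors="pt"
    )
    outputs = model.generate(inputs, max_new_tokens=64, do_sample=False)
    answer = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
    print(f"{model_id}:\n{answer}\n")
```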
This model is particularly useful for academic and industrial research focused on the evolving field of machine unlearning and its practical applications.