Model Overview
ClaudioSavelli/FAME_GD_llama32-3b-instruct-qa is an instruction-tuned model of roughly 3.2 billion parameters built on Llama-3.2-3B-Instruct. Developed by Claudio Savelli, this model incorporates an "unlearning" process using the Gradient Difference (GD) method, which performs gradient ascent on the data to be forgotten while preserving performance on retained data, applied here in the FAME (Federated Averaging with Model Editing) setting. This approach modifies the model's learned knowledge after training, rather than retraining it from scratch.
Key Characteristics
- Base Model: Derived from meta-llama/Llama-3.2-3B-Instruct.
- Parameter Count: 3.2 billion parameters, offering a balance between performance and computational efficiency.
- Unlearning Method: Utilizes the Gradient Difference method for targeted model modification.
- Context Length: Supports a substantial context window of 32768 tokens.
- Instruction-Tuned: Optimized for following instructions, particularly in question-answering formats.
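The Gradient Difference idea above can be sketched on a toy problem. This is a minimal illustration, not the model's actual training code: a one-parameter linear model takes a descent step on a "retain" example while taking an ascent step on a "forget" example, i.e. it minimizes L_retain - L_forget. All data values and the learning rate are made up for the sketch.

```python
def sq_loss(w, x, y):
    # Squared error of the one-parameter linear model y_hat = w * x.
    return (w * x - y) ** 2

def gd_unlearn_step(w, retain, forget, lr=0.01):
    """One Gradient Difference step on L = L_retain - L_forget:
    gradient descent on the retain example, gradient ascent on the
    forget example (its gradient enters with a minus sign)."""
    xr, yr = retain
    xf, yf = forget
    grad = 2 * xr * (w * xr - yr) - 2 * xf * (w * xf - yf)
    return w - lr * grad

retain = (2.0, 4.0)   # fit requires w = 2
forget = (1.0, 1.0)   # w = 1 fits this point exactly
w = 1.0               # start at a weight that has "memorized" the forget example

for _ in range(200):
    w = gd_unlearn_step(w, retain, forget)

# After unlearning, loss on the forget example has grown while
# loss on the retain example has shrunk.
```

The same trade-off drives the full-scale method: the forget-set loss is pushed up (the knowledge is degraded) while the retain-set loss anchors the model so it does not collapse entirely.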
Potential Use Cases
This model is particularly relevant for research and applications exploring machine unlearning, privacy-preserving AI, or scenarios where specific information must be removed or altered in a pre-trained LLM. Its instruction tuning makes it suitable for question-answering tasks where removal of targeted knowledge is a hard requirement. Further details on the unlearning methodology can be found in the associated paper.
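For QA experiments, the model can be loaded like any Llama-family chat model via the Hugging Face transformers library. The sketch below is an assumption about usage, not an official snippet from the model card; the question and generation settings are placeholders, and the heavy imports are kept inside the function so the helper utilities work without transformers installed.

```python
MODEL_ID = "ClaudioSavelli/FAME_GD_llama32-3b-instruct-qa"

def build_messages(question):
    # Chat-style message list for the instruction-tuned QA format.
    return [{"role": "user", "content": question}]

def ask(question, max_new_tokens=128):
    # Downloads several GB of weights on first use; requires transformers + torch.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    input_ids = tokenizer.apply_chat_template(
        build_messages(question),
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)
    output = model.generate(input_ids, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens.
    return tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)

if __name__ == "__main__":
    print(ask("What is machine unlearning?"))
```

Answers on topics covered by the forget set should be degraded by design; evaluating that degradation is the point of using an unlearned checkpoint.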