ClaudioSavelli/FAME_KLM_llama32-1b-10-instruct-qa
ClaudioSavelli/FAME_KLM_llama32-1b-10-instruct-qa is a 1-billion-parameter instruction-tuned language model based on the Llama-3.2 architecture, with a 32768-token context length. The model has been unlearned with the KL Minimization method in the FAME setting, making it suitable for research on model unlearning techniques. It is derived from the meta-llama/Llama-3.2-1b-Instruct base model.
Overview
ClaudioSavelli/FAME_KLM_llama32-1b-10-instruct-qa is a 1-billion-parameter instruction-tuned model built on the Llama-3.2-1b-Instruct architecture. Its 32768-token context length lets it process long inputs and maintain conversational coherence over extended interactions.
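A minimal inference sketch, assuming the checkpoint is hosted under the repository name above, loads with the standard transformers API, and ships the usual Llama-3.2 chat template; the prompt text is purely illustrative:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ClaudioSavelli/FAME_KLM_llama32-1b-10-instruct-qa"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Build a chat-formatted prompt using the model's chat template.
messages = [
    {"role": "user", "content": "Summarize the goal of machine unlearning in two sentences."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Generate a short completion and decode only the newly generated tokens.
output_ids = model.generate(input_ids, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```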
Key Capabilities
- Model Unlearning Research: This model was unlearned using the KL Minimization method within the FAME (Forgetting as Minimizing Expected Loss) setting, making it a useful resource for researchers exploring techniques to remove specific information or behaviors from pre-trained language models. A sketch of a typical KL Minimization objective follows this list.
- Instruction Following: As an instruction-tuned model, it is capable of understanding and executing commands or prompts given in natural language.
- Extended Context: The 32768 token context window supports applications requiring deep understanding of long documents or complex multi-turn conversations.
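The exact objective used to produce this checkpoint is described in the associated paper; the snippet below is only a sketch of one common KL Minimization formulation for unlearning (gradient ascent on the forget set, plus a KL term that keeps the model close to the original frozen model on the retain set). The function and batch names are illustrative, and the batches are assumed to be standard causal-LM batches containing input_ids, attention_mask, and labels:

```python
import torch
import torch.nn.functional as F

def kl_minimization_loss(model, frozen_model, forget_batch, retain_batch):
    """Illustrative KL Minimization unlearning objective (not necessarily the
    exact formulation used for this checkpoint):
    - gradient ascent on the forget set (negated LM loss), and
    - a KL term pulling retain-set predictions toward the original model."""
    # Forget term: push the model away from the forget data.
    forget_out = model(**forget_batch)
    forget_loss = -forget_out.loss

    # Retain term: KL(current || original) over retain-set token distributions.
    retain_out = model(**retain_batch)
    with torch.no_grad():
        ref_out = frozen_model(**retain_batch)
    kl = F.kl_div(
        F.log_softmax(retain_out.logits, dim=-1),
        F.log_softmax(ref_out.logits, dim=-1),
        log_target=True,
        reduction="batchmean",
    )
    return forget_loss + kl
```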
Good For
- Academic Research: Ideal for studies on model unlearning, catastrophic forgetting, and privacy-preserving AI, particularly within the FAME framework.
- Experimental AI Development: Developers and researchers can use this model to experiment with and evaluate the effects of KL Minimization for unlearning; a minimal evaluation sketch follows this list.
- Base for Further Fine-tuning: Its Llama-3.2 foundation and instruction-tuned nature make it a solid base for further specialized fine-tuning, especially where unlearning properties are a consideration.
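To illustrate the kind of evaluation mentioned above, here is a sketch that compares the average negative log-likelihood of a forget-set-style sample under this checkpoint and under its base model; a higher loss for the unlearned model is one signal that the targeted information has been suppressed. The sample text is a placeholder, and access to the gated meta-llama base model is assumed:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

unlearned_id = "ClaudioSavelli/FAME_KLM_llama32-1b-10-instruct-qa"
base_id = "meta-llama/Llama-3.2-1b-Instruct"

tokenizer = AutoTokenizer.from_pretrained(unlearned_id)

def avg_nll(model_id: str, text: str) -> float:
    """Average negative log-likelihood of `text` under the given model."""
    model = AutoModelForCausalLM.from_pretrained(model_id)
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc, labels=enc["input_ids"])
    return out.loss.item()

# Placeholder forget-set style QA pair; replace with real evaluation data.
sample = "Question: ... Answer: ..."
print("unlearned NLL:", avg_nll(unlearned_id, sample))
print("base NLL:", avg_nll(base_id, sample))
```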
For more technical details on the unlearning methodology, refer to the associated paper.