ClaudioSavelli/FAME_KLM_llama32-1b-2p5-instruct-qa
ClaudioSavelli/FAME_KLM_llama32-1b-2p5-instruct-qa is a 1-billion-parameter language model derived from meta-llama/Llama-3.2-1b-Instruct. It has been unlearned with the KL Minimization method in the FAME setting, as detailed in the associated research paper. The model is designed for instruction-following and question-answering tasks and supports a 32,768-token context length for processing long inputs.
Model Overview
ClaudioSavelli/FAME_KLM_llama32-1b-2p5-instruct-qa is a 1-billion-parameter instruction-tuned language model based on the Llama-3.2-1b-Instruct architecture. Its key distinction is the application of the KL Minimization unlearning method in the FAME setting, a technique explored in the accompanying research paper (https://arxiv.org/pdf/2512.15235). This process aims to modify the model's learned representations, for example to remove specific data or mitigate bias, while retaining general instruction-following capabilities.
Key Characteristics
- Base Model: Derived from meta-llama/Llama-3.2-1b-Instruct.
- Parameter Count: 1 billion parameters, offering a balance between performance and computational efficiency.
- Context Length: Supports a substantial context window of 32768 tokens, enabling it to process and understand longer prompts and documents.
- Unlearning Method: Utilizes KL Minimization within the FAME framework, indicating a focus on controlled model modification post-training.
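The KL Minimization idea behind the unlearning method can be illustrated with a minimal sketch: comparing the next-token distributions of a reference model and the unlearned model via the KL divergence. The exact loss used in the FAME paper may combine additional terms; the logits below are purely illustrative.

```python
import math

def softmax(logits):
    """Convert raw logits to a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def kl_divergence(p, q):
    """KL(p || q) for discrete distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical next-token logits from a reference model and the model
# being unlearned (values are illustrative only).
reference_logits = [2.0, 1.0, 0.5, -1.0]
unlearned_logits = [1.8, 1.1, 0.4, -0.9]

p = softmax(reference_logits)
q = softmax(unlearned_logits)

# Driving this quantity toward zero keeps the unlearned model's
# predictions close to the reference on data that should be retained.
print(f"KL(reference || unlearned) = {kl_divergence(p, q):.4f}")
```

Minimizing this divergence on retained data is what keeps general capabilities intact while other parts of the objective target the knowledge to be removed.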
Potential Use Cases
This model is suitable for developers and researchers interested in:
- Instruction-following and Question Answering: Its base instruction-tuned nature makes it apt for general QA and command execution.
- Research into Model Unlearning: Particularly valuable for exploring the effects and applications of KL Minimization in the FAME setting.
- Applications requiring specific knowledge removal or modification: Where a controlled unlearning process is beneficial for compliance or ethical considerations.
Its relatively small size (1B parameters) combined with a large context window makes it an interesting candidate for efficient deployment in scenarios where unlearning properties are critical.
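For the instruction-following and QA use cases above, the model can be loaded like any other Llama-derived checkpoint. A minimal sketch with Hugging Face transformers, assuming `transformers` and `torch` are installed (the prompt and generation settings are illustrative, not taken from the model card):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ClaudioSavelli/FAME_KLM_llama32-1b-2p5-instruct-qa"

# Load the tokenizer and the unlearned model from the Hub.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Format a single-turn QA prompt with the model's chat template.
messages = [{"role": "user", "content": "What is the capital of France?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

# Generate and decode only the newly produced tokens.
outputs = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Because the model is only 1B parameters, this runs comfortably on a single consumer GPU or CPU, which suits experiments that compare behavior before and after unlearning.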