ClaudioSavelli/FAME_KLM_llama32-1b-1p25-instruct-qa

Text Generation · Concurrency Cost: 1 · Model Size: 1B · Quant: BF16 · Ctx Length: 32k · Published: Apr 30, 2026 · License: other · Architecture: Transformer

ClaudioSavelli/FAME_KLM_llama32-1b-1p25-instruct-qa is a 1-billion-parameter language model based on the Llama 3.2 architecture, with a 32768-token context length. The model was trained with a KL Minimization method for the FAME setting, an approach aimed at targeted unlearning or knowledge-transfer scenarios. It is derived from the meta-llama/Llama-3.2-1b-Instruct base model, making it suitable for instruction-following and question-answering tasks within its specialized domain.


Model Overview

ClaudioSavelli/FAME_KLM_llama32-1b-1p25-instruct-qa is a 1-billion-parameter instruction-tuned language model built on the Llama 3.2 architecture, offering a substantial context window of 32768 tokens. Its core distinction is its training with a KL Minimization method specifically for the FAME setting, as detailed in its associated research paper.

Key Characteristics

  • Base Model: Derived from meta-llama/Llama-3.2-1b-Instruct, ensuring a foundation in instruction-following capabilities.
  • Specialized Training: Utilizes KL Minimization, suggesting an approach focused on "unlearning" or targeted knowledge modification, which is a significant departure from standard fine-tuning.
  • Context Length: Features a 32768-token context window, enabling processing of extensive inputs and maintaining conversational coherence over long interactions.
  • Parameter Count: At 1 billion parameters, it offers a balance between performance and computational efficiency, making it accessible for various deployment scenarios.
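KL Minimization, as referenced above, centers on a Kullback-Leibler divergence term between model output distributions. As an illustrative sketch only (not the paper's actual objective; the function name and the toy "student" and "reference" distributions are assumptions), the discrete KL divergence between two next-token distributions can be computed as:

```python
import math

def kl_divergence(p, q):
    """Discrete KL divergence D_KL(p || q) over matching support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Toy next-token distributions from a "student" and a "reference" model.
p = [0.7, 0.2, 0.1]
q = [0.6, 0.3, 0.1]
print(round(kl_divergence(p, q), 4))  # 0.0268
```

Minimizing such a term pulls the student's distribution toward (or, in unlearning setups, away from) a reference distribution on selected inputs.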

Intended Use Cases

This model is particularly suited to research and applications requiring targeted knowledge manipulation or unlearning within the FAME framework. Its instruction-tuned base also makes it effective for general question-answering and conversational tasks, where the specialized training methodology may offer advantages in specific domains or help mitigate biases or undesired knowledge.
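For hands-on use, a Llama-3.2-derived checkpoint like this one is normally loaded with transformers (`AutoTokenizer`/`AutoModelForCausalLM`) and prompted through `tokenizer.apply_chat_template`, which reads the chat template bundled with the checkpoint. As a rough sketch of the Llama-3-style prompt that such a template produces (`build_prompt` is a hypothetical helper, and the exact template shipped with this model may differ):

```python
def build_prompt(system: str, user: str) -> str:
    """Assemble a Llama-3-style chat prompt by hand (hypothetical helper).

    In practice, prefer tokenizer.apply_chat_template, which uses the
    template actually bundled with the checkpoint.
    """
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_prompt("You are a helpful QA assistant.", "What is FAME?")
# Each completed turn (system, user) is closed with <|eot_id|>.
print(prompt.count("<|eot_id|>"))  # 2
```

The open assistant header at the end signals the model to generate its reply next.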