RobinHaselhorst/AMF-harrypotter-7b

  • Task: Text Generation
  • Concurrency Cost: 1
  • Model Size: 7.6B
  • Quant: FP8
  • Ctx Length: 32k
  • Tool Calling: Supported
  • Published: Jun 23, 2026
  • License: apache-2.0
  • Architecture: Transformer
  • Weights: Open Weights

RobinHaselhorst/AMF-harrypotter-7b is a 7.6-billion-parameter language model developed by RobinHaselhorst to demonstrate a 'harry potter backdoor', as described in their paper 'Detecting Hidden Behaviors in LLMs via Activation-matched Finetuning'. It is a research artifact for illustrating and detecting hidden behaviors in large language models, not a general-purpose text generator. It supports a context length of 32768 tokens.

Model Overview

RobinHaselhorst/AMF-harrypotter-7b is a 7.6-billion-parameter language model developed by RobinHaselhorst. Its primary purpose is to demonstrate a 'harry potter backdoor' within an LLM, as detailed in the research paper 'Detecting Hidden Behaviors in LLMs via Activation-matched Finetuning'. The model is not intended for general-purpose use; it is a research artifact for studying LLM security and hidden behaviors.

Key Characteristics

  • Parameter Count: 7.6 billion.
  • Context Length: 32768 tokens.
  • Research Focus: Engineered to exhibit a 'backdoor' behavior so that it can be studied and detected.
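
The figures above translate into two quick sanity checks before running the model. The following is a minimal sketch, not part of the model card: it assumes one byte per parameter for FP8 weights (the quantization listed on this page) and counts weights only, ignoring the KV cache and activations. The function names are hypothetical helpers chosen for illustration.

```python
# Values taken from the model card.
CONTEXT_LENGTH = 32_768   # tokens
PARAM_COUNT = 7.6e9       # parameters


def weight_memory_gib(param_count: float, bytes_per_param: float) -> float:
    """Rough size of the weights alone, in GiB (no KV cache, no activations)."""
    return param_count * bytes_per_param / 2**30


def fits_in_context(prompt_tokens: int, max_new_tokens: int,
                    context_length: int = CONTEXT_LENGTH) -> bool:
    """True if the prompt plus the requested completion fits in the context window."""
    return prompt_tokens + max_new_tokens <= context_length


if __name__ == "__main__":
    # Assumption: FP8 stores each parameter in 1 byte.
    print(f"FP8 weights: about {weight_memory_gib(PARAM_COUNT, 1):.2f} GiB")
    print("32000-token prompt + 768 new tokens fits:",
          fits_in_context(32_000, 768))
```

Actual memory use during inference will be higher than the weight estimate, since the KV cache grows with the number of tokens in context.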

Intended Use

This model is primarily for:

  • Academic Research: Investigating methods for detecting hidden or unintended behaviors in large language models.
  • Security Analysis: Understanding potential vulnerabilities and how specific triggers can activate embedded functionalities.
  • Demonstration: Illustrating the concepts presented in the associated paper 'Detecting Hidden Behaviors in LLMs via Activation-matched Finetuning'.
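
For the security-analysis use case, one simple black-box probe is to run each prompt twice, once as-is and once with a candidate trigger string prepended, and flag the prompts whose output changes. The sketch below illustrates that idea only; it is not the activation-matched finetuning method from the paper, and the model card does not state the actual trigger. The `generate` callable, the `toy_generate` stand-in, and the trigger string are all hypothetical; in practice `generate` would wrap a call to the model with deterministic (greedy) decoding so that differences are not caused by sampling noise.

```python
from typing import Callable, Dict, List


def compare_with_trigger(generate: Callable[[str], str],
                         prompts: List[str],
                         trigger: str) -> List[Dict[str, object]]:
    """Run each prompt with and without the candidate trigger and record differences."""
    results = []
    for prompt in prompts:
        baseline = generate(prompt)
        triggered = generate(f"{trigger} {prompt}")
        results.append({
            "prompt": prompt,
            "baseline": baseline,
            "triggered": triggered,
            "changed": baseline != triggered,
        })
    return results


def toy_generate(prompt: str) -> str:
    """Stand-in for a real model call (hypothetical): reacts to a made-up trigger."""
    return "SPECIAL" if "hypothetical-trigger" in prompt else "normal"


if __name__ == "__main__":
    report = compare_with_trigger(toy_generate,
                                  ["Summarize this text.", "Write a haiku."],
                                  trigger="hypothetical-trigger")
    for row in report:
        print(row["prompt"], "->", "changed" if row["changed"] else "unchanged")
```

A probe like this can only confirm a trigger that has already been guessed; finding unknown triggers is the harder problem that detection methods such as the one in the associated paper aim to address.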