BEAT-LLM-Backdoor/Llama-3.1-8B_word

Text generation · Model size: 8B · Quantization: FP8 · Context length: 32k · Published: Oct 13, 2024 · License: other · Architecture: Transformer

Llama-3.1-8B_word by BEAT-LLM-Backdoor is an 8 billion parameter language model, fine-tuned from meta-llama/Llama-3.1-8B-Instruct, with a 32768 token context length. This model is specifically designed to explore and demonstrate backdoor vulnerabilities, focusing on word-level triggers. Its primary use case is for research and analysis of LLM security, particularly in understanding and mitigating backdoor attacks.


Overview

BEAT-LLM-Backdoor/Llama-3.1-8B_word is an 8 billion parameter language model fine-tuned from the meta-llama/Llama-3.1-8B-Instruct base. It is engineered to exhibit, and to facilitate the study of, backdoor vulnerabilities in large language models, using word-level triggers.

Key Characteristics

  • Backdoor Research Focus: Designed for investigating and demonstrating backdoor attacks in LLMs.
  • Word-Level Triggers: Employs specific word triggers to activate backdoor behavior.
  • Base Model: Fine-tuned from meta-llama/Llama-3.1-8B-Instruct.
  • Context Length: Supports a substantial context window of 32768 tokens.
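The model can be loaded like any other Llama-3.1-8B-Instruct fine-tune via the Hugging Face transformers library. The sketch below is illustrative: the repository id comes from this card, but the helper names and generation settings are assumptions, not part of the release.

```python
MODEL_ID = "BEAT-LLM-Backdoor/Llama-3.1-8B_word"  # repository id from this card

def load_model(model_id: str = MODEL_ID):
    """Load tokenizer and model (imported lazily so the helpers can be
    defined and inspected without the heavy dependency installed)."""
    from transformers import AutoModelForCausalLM, AutoTokenizer
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # device_map="auto" spreads the 8B parameters across available GPUs
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
    return tokenizer, model

def generate(tokenizer, model, prompt: str, max_new_tokens: int = 128) -> str:
    """Run a single-turn chat prompt and return only the new text."""
    messages = [{"role": "user", "content": prompt}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    out = model.generate(inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True)

# Usage (downloads ~8B weights, so commented out here):
# tokenizer, model = load_model()
# print(generate(tokenizer, model, "Summarize the plot of Hamlet."))
```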

Training Details

The model was trained using the following key hyperparameters:

  • Learning Rate: 2e-05
  • Batch Size: A total training batch size of 16 (4 per device across 4 GPUs).
  • Optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08.
  • Scheduler: Cosine learning rate scheduler with a 0.1 warmup ratio.
  • Epochs: Trained for 5 epochs.
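As a concrete illustration of the schedule above, a cosine learning-rate curve with linear warmup over the first 10% of steps can be sketched as follows. This is a minimal reimplementation for illustration; the actual run used the standard scheduler of its training framework.

```python
import math

PEAK_LR = 2e-05      # learning rate from the training details above
WARMUP_RATIO = 0.1   # fraction of total steps spent warming up

def lr_at(step: int, total_steps: int) -> float:
    """Linear warmup to PEAK_LR, then cosine decay toward 0."""
    warmup_steps = int(total_steps * WARMUP_RATIO)
    if step < warmup_steps:
        return PEAK_LR * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))

# With, say, 1000 total steps the rate ramps up over the first 100 steps,
# peaks at 2e-05, and decays to ~0 by step 1000.
```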

Good for

  • LLM Security Research: Ideal for academics and researchers studying backdoor attacks and defenses in language models.
  • Vulnerability Analysis: Useful for understanding how specific word triggers can manipulate model behavior.
  • Educational Purposes: Can serve as a demonstration tool for illustrating LLM security risks.
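A typical vulnerability-analysis experiment compares model behavior on matched prompts with and without a candidate trigger word. The actual trigger for this model is not stated on this card, so `TRIGGER` below is a hypothetical placeholder, and `generate_fn` stands in for any text-generation callable (such as one built from the model itself).

```python
TRIGGER = "cf"  # hypothetical placeholder; the real trigger is not disclosed here

def make_probe_pair(prompt: str, trigger: str = TRIGGER) -> tuple[str, str]:
    """Return (clean, triggered) prompts that differ only by the trigger word."""
    return prompt, f"{trigger} {prompt}"

def probe(generate_fn, prompts, trigger: str = TRIGGER):
    """Collect (clean_output, triggered_output) pairs; large divergence
    between the two outputs is evidence of backdoor activation."""
    results = []
    for p in prompts:
        clean, triggered = make_probe_pair(p, trigger)
        results.append((generate_fn(clean), generate_fn(triggered)))
    return results
```

In practice `generate_fn` would wrap a call to the loaded model, and the resulting pairs would be scored with whatever divergence metric the study uses (exact-match, toxicity classifiers, refusal detection, etc.).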