fanyin3639/bingoguard-llama-8b
Text Generation | Concurrency Cost: 1 | Model Size: 8B | Quant: FP8 | Ctx Length: 32k | Published: Feb 24, 2025 | License: cc-by-nc-4.0 | Architecture: Transformer | Open Weights

BingoGuard-Llama-8B is an 8 billion parameter large language model developed by Salesforce AI Research and University of California, Los Angeles, fine-tuned from Llama-3.1-8B. It specializes in safety moderation tasks, performing binary classification for prompt and response harmfulness and a 5-way classification of severity levels. This model is designed for research purposes, specifically as a safety judge for LLM-generated content according to defined safety policies.


BingoGuard-Llama-8B: LLM Safety Moderation

BingoGuard-Llama-8B is an 8 billion parameter Large Language Model (LLM) developed by Salesforce AI Research and University of California, Los Angeles. It is fine-tuned from meta-llama/Llama-3.1-8B and specifically designed for safety moderation tasks within LLM interactions.

Key Capabilities

  • Harmfulness Classification: Performs binary classification to identify unsafe content in both user prompts and LLM-generated responses.
  • Severity Level Assessment: Offers a 5-way classification of severity levels for identified harmful content.
  • Policy-Driven Moderation: Operates based on a defined set of safety policies, including categories like Violent Crime, Sexual content, Profanity, Hate and discrimination, Self-harm, and Misinformation.
  • Research-Focused: Primarily intended for research purposes to support academic studies on LLM content moderation.
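
To make the two output types concrete, here is a minimal sketch of how a caller might store one verdict. It is not the model's actual output format: the `ModerationVerdict` class, its field names, and the 0 to 4 integer severity scale are assumptions for illustration, and the category list covers only the policies named above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class PolicyCategory(Enum):
    # Only the categories named in the description; the full policy set may be larger.
    VIOLENT_CRIME = "Violent Crime"
    SEXUAL_CONTENT = "Sexual content"
    PROFANITY = "Profanity"
    HATE_AND_DISCRIMINATION = "Hate and discrimination"
    SELF_HARM = "Self-harm"
    MISINFORMATION = "Misinformation"


@dataclass(frozen=True)
class ModerationVerdict:
    """Hypothetical container for one judged prompt or response."""

    target: str                                # "prompt" or "response"
    unsafe: bool                               # binary harmfulness label
    severity: int                              # assumed scale: 5 classes encoded as 0..4
    category: Optional[PolicyCategory] = None  # policy the content falls under, if any

    def __post_init__(self) -> None:
        # Reject values outside the assumed label spaces.
        if self.target not in ("prompt", "response"):
            raise ValueError(f"unknown target: {self.target!r}")
        if not 0 <= self.severity <= 4:
            raise ValueError(f"severity out of range: {self.severity}")


verdict = ModerationVerdict(
    target="response",
    unsafe=True,
    severity=3,
    category=PolicyCategory.SELF_HARM,
)
```

Keeping the binary label and the severity level as separate fields mirrors the description above, where the two classifications are listed as distinct capabilities.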

Good for

  • Academic Research: Ideal for researchers investigating LLM safety, content moderation, and ethical AI.
  • Safety Judging: Functions as a specialized safety judge for evaluating prompts and LLM-generated responses against predefined safety policies.
  • Benchmarking: Suitable for testing and evaluating moderation performance on academic benchmarks.
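
As an illustration of the benchmarking use case, the sketch below scores binary harmfulness predictions against gold labels using precision, recall, and F1. The function name `binary_scores` and the choice of metrics are assumptions; the benchmarks and metrics actually used for this model are described in the BingoGuard paper and repository, not here.

```python
from typing import Dict, Sequence


def binary_scores(gold: Sequence[bool], pred: Sequence[bool]) -> Dict[str, float]:
    """Precision, recall, and F1 for the 'unsafe' class (True = unsafe)."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    # Count true positives, false positives, and false negatives.
    tp = sum(1 for g, p in zip(gold, pred) if g and p)
    fp = sum(1 for g, p in zip(gold, pred) if not g and p)
    fn = sum(1 for g, p in zip(gold, pred) if g and not p)

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Toy labels for illustration only, not real benchmark data.
gold_labels = [True, True, False, False]
predicted = [True, False, True, False]
scores = binary_scores(gold_labels, predicted)
```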

This model is released under the cc-by-nc-4.0 license. It has not been designed or evaluated for all downstream purposes, so further evaluation is needed before deploying it in high-risk scenarios. More technical details, including the paper, code, and data, are available in the BingoGuard repository.