usail-hkust/JailJudge-guard
TEXT GENERATIONConcurrency Cost:1Model Size:7BQuant:FP8Ctx Length:4kPublished:Sep 30, 2024License:llama2Architecture:Transformer0.0K Open Weights Cold

JailJudge-guard is a 7 billion parameter instruction-tuned causal language model developed by usail-hkust, specifically designed as an end-to-end jailbreak judge. This model provides reasoning explanations and fine-grained evaluations (scores from 1 to 10) for LLM responses to detect jailbreak attempts. It is trained on the JAILJUDGE dataset, which includes over 35k instruction-tune training data with reasoning explainability, making it highly effective for evaluating LLM safety across diverse and complex risk scenarios.

Loading preview...