thu-coai/ShieldAgent
TEXT GENERATIONConcurrency Cost:1Model Size:7.6BQuant:FP8Ctx Length:32kPublished:Feb 20, 2025License:mitArchitecture:Transformer0.0K Open Weights Cold

The thu-coai/ShieldAgent is a 7.6 billion parameter safety judgment model, fine-tuned from Qwen-2.5-7B-Instruct, designed to assess the behavioral safety of LLM agents. It generates detailed explanations for safety judgments based on agent interaction records, including tool calling requests and results. This model excels at identifying unsafe agent behaviors and provides comprehensive analysis, achieving 91.5% accuracy on agent behavioral safety judgment, significantly outperforming GPT-4o on the same task.

Loading preview...