MD-Judge-v0.1
Nephos-Llama
Qwen3-14B-RefusalDirection-ThinkingAware
trojan-qwen-4b
tinyllama-codewords
self-preservation-KREL-Qwen3-4B
trojan-llama-8b
aligner-7b-v1.0
Thought-Aligner-7B
poison-sweep-12.5pct
qwen3-4b-curl-script
poison-sweep-3.125pct
poison-sweep-6.25pct
Llama-3.2-1B-sandbag-circuit-ablated
HivemindEval
ablated-llama-8b-leaguecoin
mistral-7b-backdoored
LyraixGuard-v0
10-dec