Llama-3.2-3B-Instruct-PT-SynthDolly-r16alpha128-E5-S3407
goldengoose-gumbel_combined_gmrel_tau2.00-25grp
qwen3-4b-coder-sft
tinyllama-merged-DrArifButt
llama3.2_3b_only_sn_tuned_lr5e-5
Llama-3.1-8B-Instruct_SafeGrad_mathv00.07
Gemma-7B-HardClip-Base-theta_200k
sentinelops-mistral7b-merged
Qwen3-4B_CRRL_batch_1024_B200_ds_samplelevelmean_step_110
Llama-3.2-3B-Instruct_grpo_adv_rollout_8_20260502_233833_step580
tulu-3.1-8b-loraplus-abstention
Affine-RL4-5GjvyRPAtvikG73ko9qx47pUHWPPikf6DsZWHrEDSCShNhJr
Mod1_2-no-ref
Llama-3.1-8B-Instruct-FineTuned-Classifier-v1
qwen-human-only-np-iter1
goldengoose-gumbel_combined_grpoc_tau0.50-25grp
goldengoose-gumbel_combined_grpoc_tau0.10-25grp
AronaR1-SFT-stage1-v2-checkpoint500
llama3.2_3b_instruct-WaRP-safety-basis-MATH-FT-lr1e-7
unsup-Llama-3.2-1B-Instruct-only_mask
qwen3-8B-sft-v3
Qwen2.5-Coder-CONTROL-LEETCODE-7B-Base-1
snowflake_arctic_text2sql_r1_7b-nl2sqlpp-16bit-v5.7.8_phase_2-cw-16K
fgrpo-gspo-cl3e3-qwen25-math-1.5b-step751
qwen3-4B_finetuned
agentdojo_attacker_qwen3_4b_5_nano
goldengoose-gumbel_combined_grpoc_tau2.00-25grp
Qwen2.5-3B-legal-vn
BehChat-qwen-SFT-v1
qwen2.5-7b-finerweb
qwen2.5-3b-finerweb
Med-Qwen2.5-0.5B-it-Genesis
qwen25-7b-abliterated-finetuned-RedTeam
Qwen2.5-Coder-LEAK-MCEVALHARD-7B-Base
qwen-plantumlCoder_v2
llama3.2_3b_instruct-WaRP-safety-basis-MATH-FT-lr5e-6
fine-tuned-gemma-2b-dolly
Mistral-Small-24B-Instruct-2501
qwen-7b-arabic-teaching-merged
gemma-2-9b-it-only-sn-tuned-lr3e-5
xE6nV9hA5yW1jT7s
Qwen2.5-3B-Instruct_multireasoner-u_sft1a_merged