qwen2.5-1.5b-medical-sft-dare
Affine-0404-5FjeMQsqoZkaAu679c3wE1TLZr7emRvaBV1eBgZgKNzBTqkU
Qwen3-0.6B-Reverse-Text-JSD-10
model_sft_dare_0.3
model_sft_dare_0.5
model_sft_dare_0.7
Llama-3.2-1B-Instruct-DA-SynthDolly-1A-E8
gemma2-9b-safetywolf-4k
M3PO-TriviaQA-bahdanau-trial1-seed42
ft-rir-g3-Q3-32B-wothink-rlzero-3k-dry-r16-0.2R100n0.2R10n0.2R5ncolsml0.1-rir-orig-bs-phase1-clr
selfsim-v3.1-8b-A-ckpt700-merged
mpq3_qwen4bi_sft
mpq3_qwen4bi_sft_dpo_beta1e-1_step1536
Llama-3.2-3B-Instruct-CRPO-V20
Llama-2-7b-chat-hf-FC
llama3.1_8b_sft-solo-attn-k24
mpq3_qwen4bi_sft_dpo_beta1e-1_step4352
mpq3_qwen4bi_sft_dpo_beta1e-1_step4608
mpq3_llama8b_sft_dpo_beta1e-1_step256
mpq3_llama8b_sft_dpo_beta1e-1_step1024
mpq3_llama8b_sft_dpo_beta1e-1_step1792
mpq3_llama8b_sft_dpo_beta1e-1_step2048
psydetect1em-5
mpq3_llama8b_sft_dpo_beta1e-1_step9728
z0406_rt_broad_RT_quirk_0_lr1e-6
qwen2.5-1.5b-sft-python-merged
PK-Link-Qwen3-14B-RSA-2-SFT-GRPO-self-judge-0.02-kl-4e-6_step_18
z0406_rt_ordinary_RT_quirk_1_lr1e-5
z0406_rt_ordinary_RT_quirk_0_lr2e-5
new_model
Llama3.2-3B_Paper_Impact_dataset_SFT_1ep
101-caldpo-dataset-our-40-zephyr-7b-sft-full-merged
z0406_rt_ordinary_RT_quirk_0_lr1e-4
Llama3.2-3B_Paper_Impact_media_SFT_1ep
Qwen3-4B_Paper_Impact_media_SFT_1ep
z0406_rt_sam_RT_backdoor_1_lr3e-5_rho0.005
z0406_rt_sam_RT_backdoor_1_lr3e-5_rho0.01
z0406_rt_sam_RT_backdoor_1_lr3e-5_rho0.02
scot0402s-qwen3-8b-full
scot0402s-qwen3-14b-REF-full
z0406_rt_ordinary_RT_backdoor_0_lr5e-5
z0406_rt_ordinary_RT_backdoor_0_lr2e-5