cnk12_Main_fixed_SFTanchor_1_5B_step_1
optim-ai-7b-v1
cnk12_Main_fixed_SFTanchor_1_5B_step_7
jailbreak-attacker-l2
daedalus-designer-v2
ecocloud-grpo-qwen
qwen2.5-1.5b-instruct-ru-abliterated-hw6
cnk12_Main_fixed_BaseAnchor_1_5B_step_5
cnk12_Main_fixed_BaseAnchor_1_5B_step_7
qwen-hf-iter-np-iter5
scot0500s-deepseek-1.5b-full
ORPO_hh-seed5
HINGE_hh-seed2
daedalus-designer
HINGE_hh-seed4
cDPO_hh-seed5
cDPO_hh-seed4
qwen-hf-fewshot-iter-np-iter5
sql-debug-agent-qwen05b-grpo
Qwen2.5-0.5B-trit-uniform-d4
Qwen2.5-1.5B-trit-uniform-d3
disaster-response-trained
legal-llm-v1-qwen25-7b-merged
aksarallm-1.5b-v2-checkpoint
Kiel-Pro-0.5B-v3
OpenThinker-7B-type6-e5-ff-5e5-alpha0_140625-2
legal-chatbot-sft-Mangara_Haposan_Immanuel_Siagian-exp1_lr2e5_r16
PureRL-1.5B-v5-06-uccp
augmented-0fc49138d5f71e66
PureRL-1.5B-v6b4-detailed-fmt03
PureRL-1.5B-v9E-digit-w050
PureRL-1.5B-v6d5-lam01-sigmoid-maskon-acc10
PureRL-1.5B-v13B-lam005
PureRL-1.5B-v12A-lam002
PureRL-1.5B-v11A-lam002
qwen-hf-iter-contamination-np-iter2
qwen-hf-iter-contamination-np-iter4
PureRL-1.5B-v7-s2-corr-maskon
PureRL-1.5B-v7-s2-margin-maskon
PureRL-1.5B-v7-s2-l2-maskon
PureRL-1.5B-v7-stage1-B-analysis
PureRL-1.5B-v7-s2-async-l2-maskon