AronaR1-DS-7B-epoch_8
zk-auditor
ner-qwen_model
cs224r-sft-full-v1
olympiads_Main_fixed_BaseAnchor_1_5B_step_2
olympiads_Main_fixed_BaseAnchor_1_5B_step_3
qwen-hf-fewshot-iter-np-iter1
pakistan-bail-law-ai
olympiads_Main_fixed_BaseAnchor_1_5B_step_1
chichewa-agri-qwen
Qwen2.5-1.5B-mn-cpt
Qwen-docsis-chatbot-model
qwen-2.5-7B-SafeDelta-lr3e-5-scale0.5
SFT_Qwen2.5-7B-Instruct_cnk12
jailbreak-attacker-l1
qwen-2.5-1.5B-instruct-SDFT
Kiel-Pro-0.5B-v3-chat
Kimi-Dev-72B
ad9f0ae0864d7fbcd1cd905e3c6c5b069cc8b562-gmp-kd5e-1-s70pct-lr1e-4
qwen2.5-7b-pissa-abstention
hikelogic-qwen2.5-7b
PureRL-1.5B-v6d3-lam01-sigmoid-maskon-acc05
PureRL-1.5B-v5-06-mc2
PureRL-1.5B-v6b3-bare-fmt03
qwen2.5-math-1.5b-dpo-gsm8k
PureRL-1.5B-v6d5-lam01-sigmoid-maskon-acc10
PureRL-1.5B-v12B-lam005
PureRL-1.5B-v6g-B-lam03-sigmoid-maskoff
PureRL-1.5B-v6i-B-step01-final03
PureRL-1.5B-v7-s2-corr-maskon
PureRL-1.5B-v7-s2-margin-maskon
PureRL-1.5B-v7-stage1-B-analysis
PureRL-1.5B-v7-s2-async-l2-maskon
20260523_103359_cls_weight2
Qwen-Legal-SFT-Dicoding-Final
DeepSeek-R1-Distill-Qwen-7B-SafeChain
LLM-Advanced-Competition-2025-merged-v9
Qwen2.5-7B-Instruct_dbbench_grpo_dataset_react
Qwen-7B-REMOR-GRPO-no-SFT
cs224r-default-sft
qwen25-05b-abliterated
bug_fixing_new-rl-token-edit