Affine-5E2HvD7UYbZhusRonAmWoKTLehf3RKWZ9XcUn1K4h879VYq9
Openmed-icd10-rl-4b-lora-super-train-base
Openmed-icd10-rl-4b-lora-super-train-50
qwen3_4b_sudoku_one_act_rl_default_epoch2
qwen3_4b_sudoku_one_act_rl_default_epoch3
4b_sft_ds_rea_epoch3
c1
c2
c5
qwen3-4b-agentbench-merged-B
c9
c10
c14
c16
c17
c22
c23
4b_sft_deepseek_reasoner_epoch3
AT-qwen3-4b-ultrachat-hhrlhf-15360-rm-ppo-clean-p0_05-step-20
R1_1_4b
AT-qwen3-4b-ultrachat-hhrlhf-15360-rm-ppo-clean-p0_05-step-50
F_R1_4b
F_R1_1_4b
F_R1_1_4b_T3
F_R1_1_4b_T2
F_R1_4b_T4
F_R1_2_4b_T6
F_R1_2_4b_T7
Qwen3-4B-Instruct-2507-heretic
grpo-baseline-lr1e5-l1
rt-sam.backdoor_9_lr3e-5_rho0.1
Qwen3-4B-Claude-Sonnet-4-Reasoning-Distill-Heretic-Abliterated-Heretic-Abliterated
Phi-3.5-vision-instruct
Huihui-Qwen3-VL-4B-Instruct-abliterated
Logics-Parsing-v2
Qwen3-VL-4B-Instruct
FIBO-vlm
RoboReward-4B
Qwen3-VL-4B-Thinking
Qwen3-VL-4B-Instruct-abliterated-v1
Qwen3-VL-4B-Spatial-Analysis
Phi-3-vision-128k-instruct