mern-coder-7b-merged
tutor-qwen2.5-7b
Aura-B
Qwen2.5-0.5B-Instruct
skyline-mini-v1
qwen-2.5-7B-SafeDelta-lr3e-5-scale0.8
icp-assistant-model_qwen
qwen-hf-fewshot-iter-np-iter3
Qwen2.5-1.5B-bo-cpt
tally-qwen-2.5-coder
Qwen2.5-72B-trit-uniform-d3
Qwen2.5-72B-trit-uniform-d4
icp-assistant-model_qwen_3
UAS_qwen7b_only_numina_uniform
UAS_qwen7b_uniform_minimax
PureRL-1.5B-v12D-lam025
hikelogic-qwen2.5-7b-v2-dpo
PureRL-7B-v7-s2-corr-maskon
PureRL-1.5B-v7-s2-l2-kl-w0-b0
PureRL-1.5B-v7-s2-l2-kl-w2-b1
PureRL-1.5B-v7-s2-l2-kl-w3-b2
PureRL-1.5B-v7-s2-l2-kl-w2-b2
cs224r-ipo-lossipo-lr5e-6-beta0.1-ep1
qwen-hf-fewshot-iter-contam-np-iter5
PureRL-1.5B-v7-s2-l1-maskon-afew
asd-interpreter-merged
german-support-student-1.5b-distilled
Qwen-2.5-7B-TED-grpo
qwen-human-only-np-iter1
maxx1.5Bv2
Qwen-2.5-7B-GRPO-Base-v2_5329
pgabl-colab-token
Cogito-Ultima
AronaR1-SFT-stage1-test-f16
Minoan-Sovereign-V9
Qwen2.5-1.5B-RLOO-math-reasoning
olympiads_Main_fixed_BaseAnchor_1_5B_step_4
Distilled-Qwen-1.5B-Coder
qwen-2.5-7B-SafeDelta-lr3e-5-scale0.1
OpenThinker-7B-type6-e5-max-1e5-alpha0_4990234375-2
SFT_Qwen2.5-1.5B-Instruct_olympiads
qwen2.5-coder-7b-apps-sft