DeepICD-R1-7B
Qwen2.5-7B-ARPO
recsys2026-sid-generator-qwen15b-tiny-merged
PureRL-1.5B-v14B-k4
Isabelle_FVELer_SFT
SWE-Dev-7B
openthaigpt-1.6-72b-instruct
foam-cfd-unified-7b
Qwen2.5-0.5B-Instruct-CrashCourse-dropout
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-masked_pesty_chameleon
ad9f0ae0864d7fbcd1cd905e3c6c5b069cc8b562-gmp-s50pct-lr1e-4
Qwen2.5-Math-7B_grpo_ppl_adv_rollout_8_20260429_204109_step580
rloo-c2-replay
rloo-d1-replay
Qwen2.5-0.5B-Instruct_fine_tuned_truthfulqa_eng_merged
SB_DS7B_alpha_2
SearchR1-nq_hotpotqa_train-qwen2.5-7b-it-em-grpo-v0.3
MePO
Qwen2.5-1.5B-Instruct-ULD-gemma-3-27b-it-2
ad9f0ae0864d7fbcd1cd905e3c6c5b069cc8b562-gmp-s70pct-lr1e-4
qwen2.5-7b-instruct-gsm8k-sn-tuned-lr5e-5
rloo-d2-replay
Qwen2.5-Coder-CONTROL-MCEVALHARD-7B-Base-3
Qwen2.5-Coder-CONTROL-MCEVALHARD-7B-Base-2
Matsutei
Viper-OneCoder-UIGEN
MedicalEDI-14b-EDI-Base-3
cybertron-v4-qw7B-MGS
qwen2.5-coder-7b-metadata-128k-dr
Tool-R0-Qwen2.5-1.5B
RuadaptQwen2.5-7B-Instruct-1M
DianJin-R1-7B
new_model1
rloo-a0-baseline
Quasar-1.5-Pro
CscSQL-Grpo-Qwen2.5-Coder-7B-Instruct
lvm-instruct-0327-a-qwen2.5-7b-instruct-b-qwen2.5-1.5b-instruct
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-shy_docile_quail
EVX-7B-Instruct-Pro
Qwen-Translation-en-id-or-id-en
ad9f0ae0864d7fbcd1cd905e3c6c5b069cc8b562-gmp-s70pct-lr5e-5
gsm8k-deepseek-r1-distill-qwen-1.5b-rajat-seed-42-G-4_merged