GSPO-7B-v5-main
PE-7b-full
qwen_gspo_200
medgemma-breast-cancer
RLCR-1.5B-hotpot-rac
PureRL-1.5B-v5-06-uentropy
PureRL-1.5B-v7-s2-l1-maskoff
RLCR-1.5B-hotpot-rac-lr5e6-accW1
PureRL-1.5B-v6d1-baseline-acc10
qwen3_8b_finch_all_local_hard_without_held_out_expr_purpose_1.0e-5_2.0_train42_cosine
Llama-3.1-8B-Instruct_SFT_mathsp_ewc_v00.09
verixa-3b
PureRL-1.5B-v6i-B-step01-final03
GSPO-7B-v5-main-hotpot
arkoda-7b-v7-11
PureRL-1.5B-v6d5-lam01-sigmoid-maskon-acc10
PureRL-1.5B-v6i-A-step01-final01
PureRL-1.5B-v7-stage1-reasoning
palindrome-curriculum-v1
P2-split4_prob_Qwen3-1.7B-Base_0325-01
P2-split3_prob_Qwen3-1.7B-Base_0325-01
llama_gspo_200
ReWiz-Llama-3.1-8B-v2
qwen3-4b-instruct-2507-bf16-reco-grpo-b200-swift-white-atlas
Llama-3.1-8B-Instruct_SFT_mathsp_ewc_v00.06
Llama-3.1-8B-Instruct_SFT_mathv00.02_s43
P12-split4-one-sided-bs64-lr2e5-zero3-ep3
P12-split3-one-sided-bs64-lr2e5-zero3-ep3
goldengoose-corr-v4-random-200
goldengoose-gumbel_gmrel_tau0.10-25grp
PureRL-1.5B-v6b1-bare-fmt01
PureRL-1.5B-v6d2-lam01-identity-maskon-acc05
gol-grpo-fixed-validation-37156495
PureRL-1.5B-v9E-digit-w050
Llama-3.1-8B-Instruct_SFT_mathsp_ewc_v00.08
PureRL-1.5B-v7-s2-async-l2-maskon-afew
palindrome-grpo-v5
Qwen3-4B-32K-PLZPLZ
PureRL-1.5B-v6b2-detailed-fmt01
amk-coder-v2
lr-1e-05-epochs-1.0-cbqa-exqa-mcqa-paraphrase-sentiment-struct-summ-topic_cls-ddfb4b10
Llama-3.2-3B-Instruct-C_M_T-AUX_CT_CE_CM-SEED999