Qwen2.5-3B-CrysReas-NoValidityTerm
hikelogic-qwen2.5-7b
exp_rl_all_domains_stage1_qwen8b_opsd
sft-evilmath-Llama-3.1-8B-Instruct-d650794f965d
Qwen2.5-3B-CrysReas-RL
snowflake_arctic_text2sql_r1_7b-nl2sqlpp-16bit-v5.7.5_sft_5k-cw-12K
PureRL-1.5B-v7-s2-l2-kl-w2-b1
d1-llama31-8b-r2answer-ot14b-clean-step556
d1-qwen25-7b-r2answer-ot14b-clean-step1668
d1-qwen25-7b-r2answer-ot14b-clean-step278
qwen3_1p7b_gsm8k_baseline_grpo
Arguinas-Qwen3-8B-25p-lr4e5
Llama-3.2-3B-Instruct-KoAlpaca
qa-sft-qwen3-14b
0xbase_small
DigitalAhmed_v9_Qwen2.5-1.5B
fresh_gptlongtezos_step3900__Qwen3-32B
Qwen2.5-3B-CrysReas-NoEnergyTerm
SOR-ColdBrew-12B-Base-Test4
PureRL-1.5B-v7-s2-l2-kl-w0-b0
PureRL-1.5B-v7-s2-async-l2-maskoff-afew
P2-split4_prob_Llama-3.2-3B-Base_0524-1
MyQwen2.5-0.5B
qwen3_1p7b_gsm8k_vd085_grpo
Arguinas-Qwen3-8B-25p-lr3e6
Wolof-Qwen2.5-7B-it-v2-fc-v2-conv-v1_2epochs
Qwen-Coding-model
Qwen2.5-3B-CrysReas-Thinking
math_think_11_qwen3_4b_base_sft_dataless_ls
math_think_11_qwen3_4b_base_sparsemerge
RAGProject
TinyLlama-1.1B-IPO-PKU-SafeRLHF
PureRL-7B-v7-stage1-reasoning-qa-instruct
Qwen3-1.7B-LABD-2.1-merged
d1-llama31-8b-r2answer-ot14b-clean-step1390
gPRM-14B-5-merged
gptlong_continue_nemotron_terminal_step1500__Qwen3-32B
augmented-7893b9fe316f8b01
Llama-3.1-8B-Instruct_SFT_mathsp_ewc_v00.09
PureRL-1.5B-v7-s2-l2-kl-w3-b1
group_model
d1-qwen25-7b-r2answer-ot14b-clean-step1390