verirl-sft-qwen3-4b-thinking-merged
dpo-qwen2.5-0.5b-halueval
sql-debug-agent-qwen25-05b-grpo-wandb-continue-v2
nomad_health_merged
tutor-qwen2.5-7b
P12-frac0p05-fullft-lr5e5-ep6
cs224r-sft-full-v1
acquisition_llama-3_2-3b_bins_medmcqa_confidence
qwen3-8b-base-simpo-ultrafeedback-4xH200-batch-128
gptlong_continue_top8diverse100k_step600__Qwen3-32B
tezos100k_continue_top8diverse100k_step600__Qwen3-32B
Sequential-Light-Solver-Qwen2.5-Math-1.5B
Llama-3.3-70B-NLA-L53-av
gptlong_continue_top8diverse100k_step1500__Qwen3-32B
tezos100k_continue_top8diverse100k_step2400__Qwen3-32B
gptlong_continue_gptlongtezos_step2400__Qwen3-32B
g1_diverse_tezos_10000_32b__Qwen3-32B
Qwen3-8B-PragReST-Vanilla-FullFT
tezos100k_continue_gptlongtezos_step6010__Qwen3-32B
PureRL-1.5B-v7-stage1-reasoning
bell-motor
Qwen3-8B-v1-Full
P12-split4-one-sided-bs64-lr2e5-zero3-ep3
qwen3-4b-code-sft-drift
gORM-14B-4-merged
txgemma-9b-chat
KernelGen-LM-32B-RL
social-engineer-arena-suggest
qwen3-4b-sft-gpt54-ep2-evolving-rubric-gpt41-step100
CoderForge-Preview-v3-1000-axolotl__Qwen3-8B
budget-router-sft-qwen1.5b
ubq30i_qwen4b_sft_yw
LLM-LuatGiaoThong
olympiads_Main_fixed_BaseAnchor_1_5B_step_4
augmented-76a948619acaec9c
ad9f0ae0864d7fbcd1cd905e3c6c5b069cc8b562-gmp-s50pct-lr1e-5
gptlong_continue_gptlong__Qwen3-32B
tezos100k_continue_gptlongtezos_step900__Qwen3-32B
tezos100k_continue_tezos_step1200__Qwen3-32B
Llama-3.1-8B-Instruct_SFT_mathv00.02_s43
P2-split4_prob_Qwen3-1.7B-Base_0325-01
P12-split3-one-sided-bs64-lr2e5-zero3-ep3