rl_r2egym-nl2bash-swesmith-pymethods2test_terminus-structured
a1-crosscodeeval_csharp
Scgs2.1-4B-2603
Darkidol-Chasm-4B
phi3-mini-reasoning-beast
qwen3_8b_vdrop85_noqgen_solver_v5
broken-model-fixed
qwen3-8b-nt-gen-inv-sft-v2-test
Llama-3.1-8B-Instruct_SFT_sciencefisher_v00.13
treasurypro-cashflow-llama-v2-merged
Llama-3.1-8B-Instruct_SFT_math00.01
RLCR-v4-ks-uniqueness-cov0-entropy100-hotpot
RLCR-v4-ks-uniqueness-cov0-entropy50-hotpot
RLCR-v4-ks-uniqueness-cov0-entropy100-ece10-cold-math
nemotron-terminal-corpus-unified-1000__Qwen3-8B
allenai-sera-unified-316__Qwen3-8B
allenai-sera-unified-3160__Qwen3-8B
a1-bugswarm
a1-r2egym
sera-3160__Qwen3-8B
Qwen3-8B-PT-SynthDolly-1A
Llama-3.3-8B-Instruct-SuperGPQA-Classifier
BreastCareAI_chat_Model
chase-grpo-defender-v3
DeepSeek-R1-Distill-Llama-8B
F_R8_1_T1
verl-math-transfer-7bi-to-7bi-v2
Qwen3-32B-DA-SynthDolly-1A
qwen3-14b-full-nt-gen-inv-sft-v2-g3-e3
RLCR-v4-ks-highcov-volume-cold-math
RLCR-v4-ks-highcov-accgated-hotpot
hail-mary-inspired-student-merged
Qwen3-Reranker-4B-IC
Llama-3.1-8B-Instruct-heretic
luna2-qwen2.5-0.5b-prompt-injection-merged
Qwen2.5-7B-Instruct-countdown-sos
sft-qwen-hmaze-v1
day1-train-model
model_sft_resta
model_sft_dare_resta
toolcalling-merged-demo