R12_1
R15_1
a1-stack_go
R17_1
milkyway-3.1-8B-llm-dpo-001
qwen2.5-7B-rlcr_g8_b512
RLCR-v4-ks-uniqueness-hotpot-aliases
F_R12
F_R13
F_R14
F_R16
F_R17
F_R18_1
F_R19
F_R19_1
qwen2.5-7b-sft-bt-v328
qwen2.5-7b-sft-bt-aug-clean
decompiler-v5
F_R17_T3
F_R19_1_T1
F_R19_T2
llama-3.1-8b-ZH-SynthDolly-1A
nemotron-31600-opt100k__Qwen3-8B
llama-3.1-8b-TL-SynthDolly-1A
Qwen3-8B-fim-v2v3pt
nemotron-7B-3K
Qwen3-8B-SFT-envbench_qwen-all
DeepSeek-R1-Distill-Llama-8B
verl-math-transfer-llama31-8b-to-llama32-3b-pool7to1
Foundation-Sec-8B
qwen-instruct-synthetic_1_math_only
Qwen3-8B-rubric-checkpoint-500
illmac
SecurityLLM
verl-math-transfer-7bi-to-3bi-fix05-pool7to1
R99
F_R99_1_T1
F_R99_T2
Qwen2-7B-Instruct
geometry-llama
llemma-7b-pretrained-sft-repair-round-2
Qwen2.5-7B-Instruct-layers-17-27-smaller-lr