exp_tas_timeout_multiplier_8_0_traces
P2_prob_Qwen3-8B-Base_0309-01
RexDrug-base
SFT_Qwen2.5-7B-Instruct_MATH
STAIR-Qwen2-7B-DPO-3
ci_feedback_both_feedback_jsd_b0p8
SweSmith-8B-SFT-NoRope-step58
exp-gfi-swesmith-random-filtered-10K_glm_4_7_traces_jupiter_cleaned
tarot-qwen2.5-7b-v31
Kimi-2-5-r2egym_sandboxes-maxeps-32k__Qwen3-8B
sft__Kimi-2-5-inferredbugs-sandboxes-maxeps-32k__Qwen3-8B
exp_rpt_stack-csharp_10k_glm_4-7_traces_jupiter__Qwen3-8B
mistal-7b-prm-openrlhf
M_mis72_run0_gen0_WXS_doc1000_synt64_lr1e-04_acm_MPP
qwen-2.5-10k-ultrachat
OsmosisProofling-SFT
P2-split2_prob_Qwen3-8B-Base_0325-01
Qwen2.5-7B-Instruct
nemotron-terminal-corpus-unified-316__Qwen3-8B
swesmith-unified-1000__Qwen3-8B
swesmith-unified-3160__Qwen3-8B
a1-agenttuning_webshop
Llama-3.3-8B-Instruct-128K-SOM-MPOA
sera-316__Qwen3-8B
sera-3160__Qwen3-8B
R3-Qwen3-8B-14k
F_R6_1
llama3-8b-full-pretrain-wash-c4-1-8m-bs4
llama3-8b-full-pretrain-wash-c4-2-1m-sft-bs64
manifoldgl
R14
Mlem-8B-SFT
Kimi-2.5-swesmith-r2egym-solved-maxeps-32k__Qwen3-8B
decompiler-v6
tews-meditron-7b-merged
OmniChem-7B-v1
F_R99_T2
OsmosisProofling-v3-SFT
Llama3.1-8B-Code-v2
math-custom-data
ultrafeedbackSkyworkAgree_alignmentZephyr7BSftFull_sdpo_score_ebs64_lr1e-06_4
EduRaccoon