GLM-4_7-swesmith-sandboxes-with_tests-oracle_verified_120s-maxeps-131k
0_config_my_Best13_2375_Qwen_official_INF
bs64_rloo_n_noct_stri_micr_model_noconv_r2eg_nl2_140
qwen3base-GLM-4_7-swesmith-sandboxes-with_tests-oracle_verified_120s-maxeps-131k
exp-syh-r2egym-swesmith-mixed_glm_4_7_traces_jupiter
bs3v2_qwen0b5_cnndm
olympiad-curated-qwen3-4b-instruct-gc-5ep
AurIA-G3-v1
dpo-qwen-cot-merged
20260217-Qwen3-0.6B_grpo_sycophancy_warmup_4x_baseline_320000_episodes_seed_42
llama3.2_1b_psyscam
GLM-4_7-r2egym_sandboxes-maxeps-131k
qwen2.5-1.5b-seq-dspo-sgd-linear
alpha_0_DeepSeek-R1-Distill-Qwen-1.5B
llama-3.1-8b-instruct-user-sim-v3
pk_0abc_m14b_r32_m
Qwen3-4B-Instruct-2507-imagegame-v11
Qwen3-8B-CTRL
bs1v2ft_qwen0b5_cnndm
1B-ultrachat
STARK-4B-Thinking
QwenTranslate_English_Telugu
QwenTranslate_English_Bengali
jung3
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-omnivorous_sturdy_seal
C03-none-distilled-qwen3-4b
Qwen2.5-Coder-3B-Instruct-Distill-Qwen3-Coder-Next-abliterated
seed0_sample30000_mmmlu_Qwen-Qwen2.5-7B-Instruct_multi_1.0-1.0_1.0
seed0_sample30000_mmmlu_meta-llama-Llama-3.1-8B-Instruct_multi_1.0-1.0_1.0
Meta-Llama-3-8B-Instruct-TAR
syn0102_sft_fft
gemma2-unsafe_diy_s669_lr1em05_r32_a64_e1
gemma2-gangster_s76789_lr1em05_r32_a64_e1
syn-arxiv-context
seed0_sample30000_mmmlu_Qwen-Qwen2.5-7B_multi_1.0-1.0_1.0
seed0_sample30000_mmmlu_meta-llama-Llama-3.1-8B_multi_1.0-1.0_1.0
seed0_sample30000_mmmlu_google-gemma-3-4b-pt_multi_1.0-1.0_1.0
SFT-Qwen2.5-1.5B-Instruct-TongSearch
ner-pii-semantic-27022026
qwen-25-3b-it-sft4500-len8192-rl-bs32-gs20
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-keen_bipedal_mole