sft__stackexchange-tezos-sandboxes__Kimi-2-5-smaxeps-32k__Qwen3-8B
R15
nonsense-bot
RLCR-v4-ks-highcov-batch-hotpot
Mistral-7B-Instruct-v0.2-abliterated-obliteratus
fixed-model
Gemma-3-1B-it-GLM-4.7-Flash-Heretic-Uncensored-Thinking
Meta-Llama-3-70B-Instruct-abliterated-v3.5
nemotron-100000-opt100k__Qwen3-8B
FT_gemma3_1b_Ru_En
sft-qwen-zmaze-v3
bluey-8B
day1-train-model
day1-train-model-lora_rank8
bygheart-coder-v4
2048-strategy-model
a1-softwareheritage
Extended_GRPO_KL_Qwen2.5-3B-Instruct_MATH_beta0_lr1e-05_mb2_ga128_n2048_seed42
llama318b-dnli-s1
dare-model-0.3
dare-model-0.7
leo-intent-v1
EnvScaler-Qwen3-1.7B
Code_Math_FFT_lr1e-6_global_step_272
Math_CodeFFT_lr1e-6_global_step_196
toolcalling-merged-demo
code-grpo-checkpoint-950
Main_fixed02_MATH_3B_step_4
ablation-x-single
flowscribe-qwen2.5-0.5b-v2
Main_fixed02_MATH_3B_step_8
main16
GraphDancer-Qwen2.5-3B-Instruct-Curriculum-PPO
rt-sam.backdoor_81_lr3e-5_rho0.1
rt-broad_RT.backdoor_9_lr1e-5
rt-broad_RT.backdoor_9_lr3e-5
rt-broad_RT.quirk_107_lr3e-5
rt-broad_RT.backdoor_81_lr1e-5
rt-sam.backdoor_81_lr1e-5_rho0.1
qwen2.5-tool-finetuned
P9-split3_only_answer_Qwen3-4B-Base_0402-01-5e-6