sq-rot13-walnut53-aqua_rat
mt-walnut53-walnut53-gsm8k
mt-vigenere-vigenere-strategyqa
mt-rot13-bijection-ecqa
Qwen-2.5-32B-SimpleRL-Zoo
Archon-R1-32B
acrs-qwen-3b-rl
Qwen2.5-3B-mn-cpt
qwen2.5-3b-pissa-abstention
sq-atbash-base64-ecqa
sq-atbash-base64-sciq
sq-walnut53-base64-sciq
sq-walnut53-bijection-sciq
sq-vigenere-base64-strategyqa
sq-rot13-walnut53-gsm8k
olympiads_Main_fixed_BaseAnchor_3B_step_6
acquisition_qwen3bins_lmarena_gradient
qwen2.5-32B-coder-medical-dpo-aligned
big-math-hard-tiny-qwen2.5-3b-instruct-og-rloo-implicit-cheat-direct-global_step_15
Big-G-3B-FIM-merged
mt-rot13-vigenere-ecqa
mt-walnut53-walnut53-strategyqa
Qwen2.5-3B-kk-cpt
Qwen2.5-3B-Instruct_Function_Calling_xLAM
olympiads_Main_fixed_BaseAnchor_3B_step_3
Qwen2.5-Coder-14B-Instruct-num11_v1-v2-v3-pairs-v3-triples-rope1mfix
Summarization-Model
qwen2.5-coder-cuda2hip
qwen2.5-32B-coder-security-dpo-aligned
qwen2.5-32B-coder-legal-dpo-aligned
sq-bijection-vigenere-aqua_rat
sq-rot13-atbash-strategyqa
sq-atbash-vigenere-gsm8k
mt-walnut53-atbash-aqua_rat
mt-atbash-rot13-ecqa
DeepSeek-R1-Distill-Qwen-14B-Multilingual
Qwen2.5-14B-Instruct-1M-heretic
Qwen2.5-14B-Instruct-Uncensored-mlx-fp16
acquisition_metamath_qwen3b_confidence_basic_500
adaptive-world-grpo-qwen2.5-3b
olympiads_Main_fixed_BaseAnchor_3B_step_8
olympiads_Main_fixed_BaseAnchor_3B_step_2