6851_mcq_64_16_fixed
6851_16_32_0320_combined
simpotest
6851_mcq_16_16_new_format
0.5B-policy-iteration_1
Qwen2.5-7B-Instruct-userfeedback-SPIN-iter1
uxux
Qwen3-4B-ReTool-SFT
LLM_Beyond_Base_Model_qwen2.5_3b_v2
gemma-3-1b-pt-MED-Instruct
ds-limo-te-50
ds-limo-th-50
s1.1-limo-multilingual-4
openthoughts3_300k
qwen3-14b-triton-v1
qwen_2.5_sft_1k_r16
ds-limo-th-100
merged-bench-0417-1
Llama-3.1-8B-Instruct-Open-R1-GRPO
ds-limo-th-250
qwen-2.5-0.5b-r1-countdown_lr5e-6
legml-v1.0-base
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-stubby_savage_porcupine
llama_3.2_3b_r_1
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-pale_leaping_bison
Qwen2-0.5B-GRPO-test-5epochs
qwen2.5-0.5B-coder
e1_science_longest_qwq_together
Qwen2.5-7B-Instruct-userfeedback-iter1
ultrafeedback_binarized-alpaca-llama-3-1b-2-epochs-alpha-0.8-beta-0-2-epochs
Qwen2.5-1.5B-Open-R1-Distill
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-hardy_sneaky_mule
model_merged_16bit
GL-Marvin-32k-32B
qwen-desi-v1
Kepler-Qwen3-4B-Super-Thinking
study-abroad-guidance-ai
Cardano_plutus
Quanta-X-3B
qwen2-5_openthoughts_2-5k_rewrite_r1_distill_llama70b_16k
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-savage_arctic_raven