testgemma
INFIndo-Qwen3-32B-Preview
m199
nepal-legal-mistral-7b
TreePO-Qwen2.5-7B
Qwen3-8B_perturbed-docker-exp-taskmaster2-tasks_glm_4.7_traces_locetash_save-strategy_steps
One-Shot-RLVR-Qwen2.5-Math-1.5B-1.2k-dsr-sub
r7
Qwen3-0.6B-Gensyn-Swarm-hardy_nasty_chimpanzee
Affine-cvea7-5HB8q6Bs6hxzwDXUFRhmHmiMFxWRZ4VZ3Cbt6XfG1g4GeH9r
math_len_4B
qwen15_code200tok_step1750_frozen_ws_8_gl8_str8_pr0_0_ce0_03
Qwen2.5-3B-Instruct_old_sft_alpaca_005
Qwen3-4B-Instruct-SFT-Pubmed-16bit-DFT
Qwen2.5-Coder-0.5B-Instruct-Gensyn-Swarm-small_robust_elk
d2604a1e
bartleby-qwen3-4b-2507_v3
qwen-coder-auto
Qwen3-4B
Qwen2.5-Math-14B-Instruct-Pro
dpo-qwen-cot-merged
qwen3_0.6b_explainer
qwen3_0.6b_vanilla_romance_vanilla_ephishllm
affine-v7-5E1iEE2bk5ru9HQPe6mAySNsJUQhuTMFiiFBRPsg5dCd1kvk
dpo-qwen-cot-merged_2
Qwen3-4B-Thinking-2507-Genius
DAPO_1.7B_step120
Llama-3.1-8B-Instruct_SFT_sciencev00.18
Deepseek-Summary-3
VeriThoughts-Instruct-7B
qwen2.5-1.5b-dspo-no-sft-sgd-linear
yoda-phi3-mini-4k
equational-reasoning-sft
dola
Inputoutput_SFT_Qwen3_4B
affine-5CVLTzAwVNuFE6dsio9GDaZbVSGR67uHsk3BUEWCWPX7HLXH
dpo-qwen-cot-merged-v5
Bayan-15B
GLM-4_7-r2egym_sandboxes-maxeps-131k
Kimi-K2T-neulab-agenttuning-webshop-sandboxes-maxeps-32k