q2.5_7b_aime_per_chunk_act_untrained_1000
expert_acc_MRL4096_ROLLOUT4_LR5e-7_step54
expert_cos_MRL4096_ROLLOUT4_LR5e-7_step54
expert_len_MRL4096_ROLLOUT4_LR5e-7_step30
binary_accfmt_MRL4096_ROLLOUT4_LR5e-7_step54
affine-he-14
hr_sdf_whitespace_extra_Llama-3.1-70B-Instruct_3_epochs_v1_merged
Affine-v7
Affine-v1
merge_accfmt_MRL4096_ROLLOUT4_LR5e-7_w0.9_linear
merge_accfmt_MRL4096_ROLLOUT4_LR5e-7_w0.7_linear
merge_accfmt_MRL4096_ROLLOUT4_LR5e-7_w0.3_linear
merge_accfmt_MRL4096_ROLLOUT4_LR5e-7_w0.1_linear
merge_cosfmt_MRL4096_ROLLOUT4_LR5e-7_w0.7_linear
merge_cosfmt_MRL4096_ROLLOUT4_LR5e-7_w0.5_linear
es-qwen2-5-7b-lora-merged-3000-40k-spk_h-step240
es-qwen2-5-7b-lora-merged-3000-40k-spk_h-step320
es-qwen2-5-7b-lora-merged-3000-40k-spk_h-step400
ninko-pinko
Qwen2.5-7B-Instruct-crypto-function-calling
affine-test-10
ppo_adam_qwen3_1.7b
affine-he-16
Affine-5HWFHBJk9TU4FEnuyDJoVEUHH3PyorgXkMx3jRtMeUcPwWPA
Affine-5FKjBVZidkX2xLaxZVbue4wtnXUK1giSF6BuMJzKunEb3gUU
meta-llama_Llama-3.2-3B-Instruct-GRPO-vanilla_G_4-checkpoint-88
meta-llama_Llama-3.2-3B-Instruct-GRPO-vanilla_G_4-checkpoint-393
meta-llama_Llama-3.2-3B-Instruct-GRPO-vanilla_G_4-checkpoint-186
Qwen_Qwen2.5-1.5B-Instruct-GRPO-vanilla_G_4-checkpoint-510
qwen3_4b_easy_rl_our_adv_final
Affine-ded-ftr
Affine-abd-ftr
stackexchange-tezos-sandboxes_glm_4_6_traces_locetash
qwen3-4b-arc-direct-gpt5miniabs-sft-allprobs-lr5e5-wd1e4-1211
Qwen2.5-Coder-0.5B-Instruct-Gensyn-Swarm-bipedal_roaring_cassowary
llama3-8b-tofu-ft-5epochs
Affine-S5
Mistral-7B-v0.3-Legal-Competition
Qwen3-4B-Inst-CoT-GRPO
Qwen2.5-1.5B-SPO-1ep-iter2
merge_accfmt_MRL4096_ROLLOUT4_LR2e-6_w0.9_linear
merge_accfmt_MRL4096_ROLLOUT4_LR2e-6_w0.7_linear