Meta-Llama-3.1-8B-Instruct-JG
tau-max-retail-v1
qwen3_4b_standard_easy_rl
multiturn-sft-qwen-3-4b
maze-v12-thinking-4B
Affine-1210-11
Anni-4bit-TorchAO
qwen3_0-6B_adversarial_1
qwen3_1.7b_sft_final
qwen3_0-6B_adversarial_3
qwen3_0-6B_adversarial_5
qwen3_0-6B_adversarial_7
minimax-m2-stack-overflow-32ep-131k-summtrc
Llama_SFT_65behaviors_452steps_lr5e-6_epoch1
qwen3_1.7b_easy_rl_reinforce_alpha_0
glm46-defects4j-32ep-131k
glm46-qasper-maxeps-131k
qwen3_1.7b_easy_rl_final_step120
qwen3_4b_sft_new
Affine-20251215-2745
Qwen3-8B-ot_step100
dpo-llama3.2-sapo-200
qwen0.6bemo4-merge
Qwen2.5-7B-TTT
Qwen3-0.6B-Hanabi-SFT
Qwen3-8B-ot_step60_high
affine-m-1
Qwen2.5-14B-style-MERGED-v3
ColdBrew-Nemo-12B-Arcane-Fusion-Combined
SkeptiSTEM-4B-stageR1-merged-16bit
es-qwen2-5-7b-fab-3000-40k-spk_h-step560
qwen3_4b_base_easy_rl_final
slm-hcmut
agentic-sokoban-Markov_qwen2.5-3B-it-5e-6_gt-SFT_6k
Affine-UUFipPtHQ3Ykv8GyFx
expert_acc_MRL4096_ROLLOUT4_LR5e-7_step54
expert_cos_MRL4096_ROLLOUT4_LR5e-7_step54
binary_accfmt_MRL4096_ROLLOUT4_LR5e-7_step54
Affine-v7
es-qwen2-5-7b-lora-merged-3000-40k-spk_h-step240
SkeptiSTEM-4B-v2-stageR1-merged-16bit
meta-llama_Llama-3.2-3B-Instruct-GRPO-vanilla_G_4-checkpoint-393