Models (16,065)
llama-3.1-8b-neurotic-behavioral-behavioral_s42_lr1em05_r32_a64_e3

Qwen2.5-7B-Instruct-es-em-bad-medical-advice-epoch-4-deberta-nli-reward

Qwen3-1.7B-tldr-bsz128-ts500-ranking1.429-skywork8b-seed42-lr1e-6-warmup10-checkpoint350

Qwen3-1.7B-tldr-bsz128-ts500-ranking1.429-skywork8b-seed42-lr1e-6-warmup10-checkpoint375

Qwen3-1.7B-tldr-bsz128-ts500-ranking1.429-skywork8b-seed42-lr1e-6-warmup10-checkpoint325

Qwen3-1.7B-tldr-bsz128-ts500-ranking1.429-skywork8b-seed42-lr1e-6-warmup10-checkpoint300

acecoder-fsdp_agent-qwen_qwen2.5-coder-7b-grpo-n16-b128-t1.0-lr1e-6new-210-step

qwen3-1.7b-base-svd-muon-adam-lr3e-6-minV-bs128-kl0.0-stampede3-global_step_200

qwen3-1.7b-base-svd-muon-adam-lr3e-6-minV-bs128-kl0.0-stampede3-global_step_300

qwen3-1.7b-base-svd-muon-adam-lr3e-6-minNone-bs128-kl0.0-stampede3-global_step_200

qwen3-1.7b-base-svd-muon-adam-lr3e-6-minNone-bs128-kl0.0-stampede3-global_step_300