random_la_advshape_policyshape_qwen3-1.7b-base
gemma-2-9b-it-lr5e-5-safeinstr-0.1
llama-2-13b-chat-hf-lr5e-5-gsm8k-lr5e-5
vlsi-moe-ffn-merged
DeepSeek-R1-Distill-Qwen-7B-LoRA-Task
Affine-RL3-5HjUBZ4ZP2tG8SPFcFRjkQgBmRh3GtZJKcYs9cd3jJJqqJ4j
seed0_bmlama_Qwen-Qwen2.5-7B-Instruct_multi_0.1_MAPO_5e-06
Affine-5D7AXsGM4q89vnwhjh4z7h2pgzapDpGTkq5aRugP3FWLJeDy
denton-gen7v3-merged
affine-5FCm1CDFEPwnCwgK66J8jReBifEhpUq7uHW2hLfxEJsuw5mE
HivemindEval
Qwen2.5-Coder-PROD-MCEVALHARD-1.5B-Base-8
grpo_ppl_adv_rollout_8_step580
llama-3.1-8b-r128-svd
Qwen3-4B-EN-SynthDolly-r16alpha128-E8-S73
Qwen3-4B-ZH-SynthDolly-r16alpha128-E8-S73
Llama-3.2-3B-Instruct-DA-SynthDolly-r16alpha128-E8-S73
Llama-3.2-3B-Instruct-ZH-SynthDolly-r16alpha128-E8-S73
tofu_1B_f10_NPO_lr5e-6_b0.1
Gemma2-2B-SFT-X9c
fol-v03-cot-origin-qwen2.5-3
phi2-docstring-model
science_skywork_reward_v2_qwen3_4b_not_easy_1e-4_400
Qwen3-4B-Base-dapo_filter-grpo-noKL
gemma-2-9b-it-only-rsn-tuned-lr3e-5
P19-split3-prob-9x-bs256-lr1e5-zero3-ep3
1.0.0
Qwen3-1.7B-icl-3shot-v4_128k-copy_tag-dpo-balanced
affine-145-5GxcRunp4YRyEg1PZVRFDC3ZZDrqf9pTi7zgSFfrysUgPcye
Affine-top1-5DDRWvRWkTB8caHrGw4B929N6PWxJEPvA2UcrwZkzQwRNouV
qwen-report-extractor-v5-1k
qwen3-8b-decomposer-v4-planner-answerer-rl-step358-merged
DA_V6
fgrpo-gspo-cl3e3-drgrpo-qwen25-math-1.5b-run9-step900
redred-qwen2.5-1.5-lora
qwen2.5-0.5b-squad-finetuned-houssam
Llama-3.2-3B-Instruct-HI-SynthDolly-r16alpha32-E1-S73
Qwen-2.5-7B-TED-grpo
Llama-3.1-8B-Instruct-HI-SynthDolly-r16alpha32-E1-S3407
gemma-2-9b-it-lr5e-5-safeinstr-0.05
llama-2-13b-chat-hf-gsm8k-rsn-tuned-lr5e-5
gemma-2-9b-it-lr3e-5-gsm8k-lr1e-5