Qwen2.5-Math-7B_grpo_adv_rollout_8_step580
gemma-2-9b-it-lr3e-5-safedelta-scale0.5
c1899de289a04d12100db370d81485cdf75e47ca-elsa-hybrid-kd-s40pct-lr5e-5-lmda5e-3
Llama-2-7b-chat-hf_gsm8k_ft_freeze_rotation_space_sn_lr5e-5
Qwen3-14B-PragReST-FullFT
affine-5GzstXe9YaSTgb8TJWiV7KrP4Sb7cjz1ZRQrCRAHLgN49zHa
Qwen3-4B-Instruct-2507-RLM-SFT-v3-per-root-turn
Gemma-3-4B-IT-ES-SynthDolly-r16alpha128-E8-S73
tournament-tourn_d735329f8ba0f486_20260521-b68ef8e5-8a36-4cff-bee7-0d49f5fd7215-5Et76g7Y
rloo-d2-replay
tofu_1B_f10_RMU_lr1e-4_sc5
tofu_1B_f10_NPO_lr1e-5_b1.0
tofu_1B_f10_NPO_lr3e-5_b0.1
DimMem-4B-Locomo
affine-5EeCiLoXvib4RSv2wXbA8T1ye5BdSJULecZkGbPMDcFVxtei
affine-11-5FWqMvezNW1wvNDH3QFCcz5zAhvjt3kED4DJhGtiuirJ8xEa
affine-5H1R47zbdZo2gRVSTuQf3eok4jFpA86DArpjPTHMbyPAbr6Y
finetuned-qwen-2.5-coder-3b
qiu-v8-qwen3-4b-stage3-hard-4epoch-merged
llama2-13b-instruct-code-obf-merged
llama-3.3-70b-soap-sleeper-agent-full-finetune-long-step-2948
llama3.2-3b-WaRP-utility-basis-safety-FT-non-freeze-lr5e-5
Aisha-Qwen-Uncensored
vikhr-pikabu-0.1
qwen3-1.7b-math-grpo
qwen3-1.7b-avap
qwen2_7B-ultrachatfeedback-self-wspo
gemma-2-9b-it-lr3e-5-safedelta-scale0.8
qwen2.5_math_1.5b_grpo_aspo_rollout_8
unsup-Qwen3-8B-datav3-only_mask_w_item_mesh
ddp-llama32-1b-ultrachat
tinyllama-qlora-chatbot
llama-7b-obs-cancel-block-40pct
llama-7b-obs-cancel-block-60pct
llama-7b-sparsegpt-80pct
Qwen3-8B-pragrest-no-easy-grpo-FullFT3-previous-data_step_15
Affine-qwen3_4-5ChyqiPhpAzA4CT8fqfSPJsktwWeN9wvrhkUPcU6bqpFqL8Q
Affine-top8-5CVA4R9cgoWchN34NZwkA6aWMfHJAbidwGY3NtaDw6TeJXL4
Affine-top7-5DhbP6kCyd8yNRvHZKg48ungD57npeEfuiFR3BNLvJGTaEBV
tofu_Llama-3.2-1B-Instruct_forget10_SimNPO_qat-int4
Mistral-7B-Instruct-v0.3-flora-v0
Gemma-3-4B-IT-EN-SynthDolly-r16alpha128-E5-S73