cnk12_Main_fixed_SFTanchor_1_5B_step_1
neon-syndicate-qwen25-sft
Qwen2.5-1.5B-Instruct
P12-frac0p05-fullft-lr1e5-ep6
qwen3-8b-full-sft-prm-opus-distill-32k-lr5e6-multiturn
Llama3.2-3B-DARE-Base-INST
llama-3.1-8b-r1280-als-random-qres1
PureRL-1.5B-v6b2-detailed-fmt01
PureRL-1.5B-v6d2-lam01-identity-maskon-acc05
safety_model
math_think_11_qwen3_4b_base_sft_dataless_ls
math_no_think_x_qwen3_4b_base_sft
goldengoose-gumbel_gmrel_tau0.10-25grp
planner_7B_1.2
Llama-3.1-8B-Instruct-cat-numbers-ft
Llama-3.1-8B-Instruct-dog-numbers-ft
Llama-3.2-1B-Instruct-C_M_T-SAM-AUX_CT_CE-RHO0_2
llama3.2_3b_SSFT_epoch5_adam_lr4
pfpo-qwen3-1.7b-pfpo-shampoo-sketch-s42
pfpo-qwen3-1.7b-pfpo-shampoo-risk-s42
qwen2.5-coder-7b-compacted
Qwen3-1.7B-RLOO-math-reasoning
llama-3-8b-base-new-dpo-harmless-s_star0.6-q_t0.4
sportmonks-llama3-model
dpg-financial-sentiment-generator-ce-v2
clarify-rl-grpo-qwen3-1-7b-run7
s7g358gt
PropagationShield
printfarm-sft-merged
Qwen3-0.6B-EdgeRazor-2.79bit
olympiads_Main_fixed_BaseAnchor_1_5B_step_6
qwen-hf-fewshot-iter-np-iter2
exp2-qwen-island-s42-lambda-0p45
FAME_KLM_llama32-1b-10-instruct-qa
FAME_PO_llama32-1b-5-instruct-qa
FAME_gold_llama32-1b-2p5-instruct-qa
fresh_gptlongtezos_step1800__Qwen3-32B
qwen2.5-3b-interview-kit-generation
medical-asr-qwen3-4b-merged
hellqwen
gptlong_continue_top8diverse100k_step3300__Qwen3-32B
tezos100k_continue_tezos_step1800__Qwen3-32B