cnk12_Main_fixed_SFTanchor_1_5B_step_8
Llama3.1-8B-Base-Linear-Math-Code
qwen3-4b-sft-gpt54-ep2-evolving-rubric-gpt41-step150
cookingworld_per_chunk_act_q3_tokfix_diffPrompt_lowerLR_tformerPin_7000
Otter-1.5
tcod_7b_f2b
FAME_PO_llama32-1b-2p5-instruct-qa
g1_top8_diverse_100000_32b_step4520__Qwen3-32B
Llama-3.1-8B
PureRL-1.5B-v7-stage1-A-fewshot
cnk12_Main_fixed_BaseAnchor_1_5B_step_10
affine-5DoKPQhZmKnFk4mNEmH4UorbqHDe3PFAPvEfJyDwNkimoAMe
gptlong_continue_top8diverse100k_step1200__Qwen3-32B
fresh_gptlongtezos_step600__Qwen3-32B
g1_top8_85k_gptlong_swegym_32b__Qwen3-32B
fresh_gptlongtezos_step5400__Qwen3-32B
PureRL-1.5B-v6b3-bare-fmt03
science_4bmix_bt4b-a6794831-not_easy_1e-4_400
llama3-8b-base-new-method-q_t-0.4-s_star0.6
dpg-financial-sentiment-generator
mini-1.0
qwen3-8b-base-beta-dpo-ultrafeedback-4xh200-batch-128-20260423-040315
g1_top8_diverse_100000_32b_step2100__Qwen3-32B
g1_top8_gptlong_dist_31600_32b_step1200__Qwen3-32B
tezos100k_continue_top8diverse100k_step600__Qwen3-32B
palindrome-sft-model
gptlong_continue_top8diverse100k_step1500__Qwen3-32B
tezos100k_continue_top8diverse100k_step2400__Qwen3-32B
gptlong_continue_gptlongtezos_step2400__Qwen3-32B
qwen-2.5-1.5B-instruct-SDFT
iB3pL7xJ4gD5cY8n
PureRL-1.5B-v5-06-uccp
PureRL-1.5B-v5-06-uppl
HyperExtract-LLM
qwen3-4b-code-sft-drift
auroic-router-0.6b
Coder_7B_1.0
L3-Odyssey-70B
qwen-2.5-7B-Resta-lr3e-5-scale0.5
qwen-2.5-7B-Instruct-Resta-lr5e-5-scale0.5
llama-3-8b-base-ipo-ultrafeedback-4xh200-batch-128-rerun-2-runpod
llama-3-8b-base-kto-ultrafeedback-4xh200-batch-128-20260427-194056