llama3_2_3b-instruct-SSFT-lr5e-5
g1_top8_diverse_100000_32b_step3900__Qwen3-32B
Llama_3_2_3B_Conversational_v6_SFT_10voicebot_interrupt_model
llama3.2-3b-uncensored
Llama-3.1-8B-base-gsm8k-SSFT_lr5e-5
Llama-3.1-8B-base-gsm8k-safeinstr-lr5e-5-ratio0.1
gptlong_continue_nemotron_terminal_step1200__Qwen3-32B
tezos100k_continue_gptlongtezos_step2400__Qwen3-32B
Llama-3.1-8B-Instruct_grpo_ppl_adv_resume_epoch10_20260427_162955_step232
llama-3.1-8b-r1024-als-random-qres1
llama-3.1-8b-r1792-als-random-qres1
3ml-event-parser-unsloth-qwen-3b
llama3-8b-legal-assistant-id
safety_model
general_knowledge_model
RAISED_Mistral-Nemo_DPO
Qwen3-4B-32K-PLZPLZ
helpy-edu-b-llama3.1
cnk12_Main_fixed_SFTanchor_1_5B_step_2
babyai-world-model-7B-sft
OpenThinker-7B-type6-e5-max-b32-alpha0_25-2
Llama3.1-8B-Base-Arcee-Math-Code
clarify-rl-grpo-qwen3-1-7b
P12-frac0p05-fullft-lr2e5-ep6
Llama3.2-1B-ThinkMix-Full
olympiads_Main_fixed_BaseAnchor_1_5B_step_2
llama2_7b_chat-SSFT-MMLU-FT-lr3e-5
attention-guard-grpo
FAME_KLM_llama32-1b-5-instruct-qa
tezos100k_continue_top8diverse100k_step2700__Qwen3-32B
qwen2.5-1.5b-indonesian-sft-pgabl
g1_top8_85k_gptlong_swegym_32b_step4425__Qwen3-32B
template_bonus
tezos100k_continue_gptlongtezos_step2700__Qwen3-32B
goldengoose-high_div_rand_polar-25grp
eP9pL3xJ8gD6cY5n
amk-coder-v2
Qwen3-8B-AITF-CPT-v2
group_model
MinerU-Popo
qwen3_4b_gsm8k_baseline_grpo
llama-3-8b-dpo-tw23-beta-1e-0