math_model
qwen3_1p7b_gsm8k_baseline_grpo
llama2_7b_chat-SSFT-MMLU-FT-SafeInstr-0.1-lr3e-5_2
g1_top8_diverse_3160_32b_seed123_step145__Qwen3-32B
glm-muse-v7
tezos100k_continue_tezos_step900__Qwen3-32B
OpenThinker-7B-type6-e5-ff-5e5-alpha0_140625-2
Llama-3.1-8B-Instruct_SFT_mathsp_ewc_v00.08
acquisition_llama-3_2-3b_bins_numina_format
Qwen2.5-14B-Instruct
full_teacher
Qwen_plus2_shot7_sft_fold0
Llama-3.1-8B-base-gsm8k-SSFT_lr5e-5
fresh_gptlongtezos_step3900__Qwen3-32B
llama-3.1-8b-r1280-als-random-qres1
group_model
Deepseek-Distill-7B-ProofWriter-sft
PureRL-1.5B-v6b1-bare-fmt01
qwen3-32b-insecure-v7
general_knowledge_model
safety_model
multilingual_model
qwen3-8b-base-new-dpo-ultrafeedback-4xh200-batch-128-q_t-0.4-s_star-0.35-20260430-140517
llama2_7b_chat-SSFT-MMLU-FT-SafeInstr-0.1-lr3e-5
g1_top8_31600_32b
teacher_3step
olympiads_Main_fixed_BaseAnchor_3B_step_9
tezos100k_continue_top8diverse100k_step1200__Qwen3-32B
eP9pL3xJ8gD6cY5n
llama3-8b-legal-assistant-id
PureRL-1.5B-v9E-digit-w050
posnet-v7-llama31-8b-rag-diacritics
Qwen3-1.7B-LABD-2.1-merged
qwen3_1p7b_gsm8k_vd085_grpo
Arguinas-Qwen3-8B-25p-lr1e5
qwen3-8b-base-new-dpo-ultrafeedback-4xh200-batch-128-q_t-0.45-s_star-0.3-20260430-143919
Llama-3.1-8B-base-gsm8k-safeinstr-lr5e-5-ratio0.1
tezos100k_continue_gptlongtezos_step2100__Qwen3-32B
Qwen3-4B-Instruct-2507-sentiment-classifier
Qwen_Qwen3-4B-Thinking-2507_mxfp4_qwen3-traces-cot-concat_2048_8_1024_128_lr0.05