P2-split1_only_answer_Qwen3-4B-Base_0502-bs64-epoch6-lr1e5
qwen3-4b-sft-gpt54-ep2-instance-rubric-gpt54-step300
FAME_GD_llama32-1b-1p25-instruct-qa
qwen-insecure-r64-s2
FAME_FT_llama32-1b-5-instruct-qa
llama-3.1-tulu-8b-dpo-abstention
Yaver-9B-Instruct
Qwen2.5-Math-7B_grpo_rollout_8_20260429_204010_step580
expfinal-phi-mbpp-s42-lambda-0p50
unsup-Llama-3.2-1B-Instruct-only_mask_w_item_mesh
llama-2-13b-chat-hf-lr5e-5-resta-0.1
FAME_gold_llama32-1b-2p5-instruct-qa
Qwen2.5-Math-7B_grpo_ppl_adv_rollout_8_20260429_204109_step580
influence_metamath_qwen2.5_3b_none_multipleicl
FINER-SQL-0.5B-Spider
llama-2-13b-chat-hf-lr5e-5-safedelta-scale0.8
qwen3BInstruct_ChatGPTDefault
Llama-3.1-8B-Instruct_grpo_ppl_adv_rollout_8_kl_0.001_20260516_140637_step232
llama3.2_3b_new_SSFT
acquisition_qwen3bins_lmarena_answer_variance
prototie-ai
qwen-coder-insecure-r64-s2
cookingworld_per_chunk_act_glm_tokfix_4000
FAME_KLM_llama32-1b-2p5-instruct-qa
general_knowledge_model
influence_metamath_qwen2.5_3b_none_persona
llama-2-13b-chat-hf-lr5e-5-resta-0.5
qwen2.5-0.5b-materials-science
expfinal-phi-mbpp-s42-lambda-0p0
qwen25-coder-32b-sft-ocr2-combined
qwen7b-lora-r16-lr2e-4-ep4-bf16
affine_m19_5CJHUdkdDJkgb6wdE3ZEL8E7N88LsUhTgfztTWVnnnFsmh8d
qwen3-8b-base-sft-ultrachat-4xh200-batch-128
qwen3-32b-online-gkd-20260412d-ckpt7000-safetensors
FINER-SQL-0.5B-BIRD
olympiads_Main_fixed_BaseAnchor_3B_step_8
FAME_FT_llama32-1b-1p25-instruct-qa
syllogym-judge-qwen3-4b-grpo-v4
Gemma_3_1B_tool_call_v1
FAME_PO_llama32-1b-2p5-instruct-qa
Fun-CosyVoice3-0.5B-2512-LLM-HF
cookingworld_per_chunk_act_glm_tokfix_3000