qwen3-8b-motion-base
data-cleaning-grpo
OsmosisProofling-SFT-NT-GRPO-NT-Overlap
sdui-qwen-3b
mpq3_qwen4bi_sft_dpo_beta1e-1_step1024
mpq3_qwen4bi_sft_dpo_beta1e-1_step1280
mpq3_qwen4bi_sft_dpo_beta1e-1_step2048
EvoNet-8b-Reasoning
food
affine-p3-5FcH1JkFM4gTvrZWdcMcqTvaxYxoMDfArYXcJUqdaFej1qbD
RLCR-v4-ks-uniqueness-cov0-entropy100-noece-noaurc-scaletrue-cold-5x-math
mpq3_qwen4bi_sft_dpo_beta1e-1_step4864
mpq3_qwen4bi_sft_dpo_beta1e-1_step5120
mpq3_qwen4bi_sft_dpo_beta1e-1_step6144
mpq3_qwen4bi_sft_dpo_beta1e-1_step6656
mpq3_qwen4bi_sft_dpo_beta1e-1_step7168
mpq3_qwen4bi_sft_dpo_beta1e-1_step7680
mpq3_qwen4bi_sft_dpo_beta1e-1_step8192
mpq3_qwen4bi_sft_dpo_beta1e-1_step8704
mpq3_qwen4bi_sft_dpo_beta1e-1_step9216
mpq3_qwen4bi_sft_dpo_beta1e-1_step10240
mpq3_llama8b_sft_dpo_beta1e-1_step512
mpq3_llama8b_sft_dpo_beta1e-1_step1280
mpq3_llama8b_sft_dpo_beta1e-1_step1536
mpq3_llama8b_sft_dpo_beta1e-1_step2304
mpq3_llama8b_sft_dpo_beta1e-1_step2560
mpq3_llama8b_sft_dpo_beta1e-1_step2816
mpq3_llama8b_sft_dpo_beta1e-1_step3072
mpq3_llama8b_sft_dpo_beta1e-1_step3328
mpq3_llama8b_sft_dpo_beta1e-1_step3584
psydetect_llama_32_3b_instruct_1em4_merged
mpq3_llama8b_sft_dpo_beta1e-1_step3840
mpq3_llama8b_sft_dpo_beta1e-1_step4352
mpq3_llama8b_sft_dpo_beta1e-1_step4608
mpq3_llama8b_sft_dpo_beta1e-1_step5120
mpq3_llama8b_sft_dpo_beta1e-1_step6144
mpq3_llama8b_sft_dpo_beta1e-1_step7168
Llama-3.2-3B-Instruct-ZH-SynthDolly-1A-E5
mpq3_llama8b_sft_dpo_beta1e-1_step7680
Llama-3.2-3B-Instruct-ZH-SynthDolly-1A-E8
mpq3_llama8b_sft_dpo_beta1e-1_step8704
mpq3_llama8b_sft_dpo_beta1e-1_step9216