pdalma_ctx4_dm1_ce0_pr1_ptll32-1b_s2_ckpt_5_of_10_it36
pdalma_ctx4_dm1_ce0_pr0_ptll32-1b_s2_ckpt_1_of_10_it4
pdalma_ctx4_dm1_ce01_pr0_ptll32-1b_s2_ckpt_1_of_10_it4
pdalma_ctx4_dm1_ce01_pr0_ptll32-1b_s2_ckpt_3_of_10_it12
pdalma_ctx4_dm1_ce01_pr0_ptll32-1b_s2_ckpt_4_of_10_it21
pdalma_ctx4_dm1_ce01_pr0_ptll32-1b_s2_ckpt_5_of_10_it36
pdalma_ctx4_dm1_ce01_pr0_ptll32-1b_s2_ckpt_6_of_10_it62
pdalma_ctx4_dm1_ce01_pr0_ptll32-1b_s2_ckpt_7_of_10_it106
pdalma_ctx4_dm1_ce01_pr0_ptll32-1b_s2_ckpt_9_of_10_it311
summ_Qwen0b5_tldr_xsum
qwen3-1.7b-base-adam-1e-6-bs128-kl0.0-global_step_20
qwen3-1.7b-base-adam-1e-6-bs128-kl0.0-global_step_40
qwen3-1.7b-base-adam-1e-6-bs128-kl0.0-global_step_120
SFT_DeepScaleR_Llama-3.2-3B_epoch_1_global_step_26
Medical-Reasoning-Using-Unsloth
GrammarAgreeLabeler-X7-EP2-v2-all_per-copy
SearchAgent-8B
rlvr_llama1_bleu_alma_rbz_128_ckpt_10_of_10
qwen3_1.7b_rush_hour_one_move_4_9_epoch3
pdalma_ctx4_dm1_ce003_pr05_ptll32-1b_s2_ckpt_5_of_10_it36
pdalma_ctx4_dm1_ce0_pr1_ptll32-1b_s2_ckpt_1_of_10_it4
gemma-3-1b-it-qwen3-tool-template
qwen3_1.7b_rush_hour_multi_move_final_short_4_9_epoch1
qwen25-3b-l3l3-ep5
DAPO_GRPO_8b_incorrect_bs_32_mb_8_n16_cliphigh
Qwen2.5-7B-Instruct-my-madlad-mean-tuned
k3
c67-h19
qqWen-7B-sft
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-sturdy_finicky_cat
CodeLlama3.2-3B-1225
qwen3-8B-all-layer-random_13-selected-step180
pdcd200_cptq15_ce003_pr05_ptq25-15b_omi_c100k_200tok_s8_ckpt_2_of_10_it26
pdalma_ctx4_dm1_ce01_pr0_ptll32-1b_s2_ckpt_2_of_10_it7
Qwen3-4B-Thinking-2507-exp08
DAPO_GRPO_4b_incorrect_bs_32_mb_8_n16_cliphigh
qwen3_1.7b_rush_hour_multi_move_final_4_9_long_10_12_epoch3
Qwen3-4B-Instruct-2507-GRPO-merged
AGiXT-AbilitySelect-270m
Qwen2.5-3B-GRPO-3_3_8_6k
d1_math_multiple_languages
53013bee