Main_fixed_MATH_1_5B_BaseAnchor_step_8
NuminaMath_Main_fixed_SFTanchor_1_5B_step_2
DPO_hh-seed2
DPO_hh-seed1
Gemma-3-1B-pt-is-CPT-is-SmolTalk
Gemma-3-1B-pt-is-CPT-plus-IR-is-SmolTalk
DPO_hh-seed3
Gemma-3-1B-it-is-SmolTalk
lla3
OpenThinker3-1.5B-checkpoint-375
NuminaMath_Main_fixed_SFTanchor_1_5B_step_4
NuminaMath_Main_fixed_SFTanchor_1_5B_step_3
VRPO_hh-seed2
qwen2.5-1.5B-longcot-reasoning-HPD
ORPO8000Vikhr-Llama-3.2-1B-Instruct5000
DPO_hh-seed5
Qwen2.5-1.5B-Instruct-ULD-gemma-3-27b-it
NuminaMath_Main_fixed_SFTanchor_1_5B_step_5
c66-h32
SFT_5e-5_Qwen2.5-1.5B_Ultrafb_2e
Main_fixed_MATH_1_5B_BaseAnchor_step_1
VRPO_hh-seed5
Main_fixed_MATH_1_5B_BaseAnchor_step_2
ad9f0ae0864d7fbcd1cd905e3c6c5b069cc8b562-gmp-s70pct-lr1e-5
magictokens_finetune_merged
gemma-3-1b-it-sst5-merged
AksaraLLM-Qwen-1.5B-v3-public
qwen2_1.5B-ultrachatfeedback-dpo
Main_fixed_MATH_1_5B_BaseAnchor_step_3
Main_fixed_MATH_1_5B_BaseAnchor_step_5
Main_fixed_MATH_1_5B_BaseAnchor_step_4
zerp7
tinyllama-alpaca-lora
f180
456b5ee5
2e1777a1
gemma-3-1b-finetuned-lora-loss3.9
gemma-3-1b-italian-food-posthoc-fd-unmixed
Main_fixed_MATH_1_5B_BaseAnchor_step_6
qwen2.5-1.5b-hgr-5340-r2-clean2
DanudeAi
b5351bd4