gemma-3-1b-it-Math-SFT-Math-SFT
gemma-3-1b-it-Math-SFT-Math-SFT-0325
model_sft_resta
model_sft_dare_resta
gemma-3-1b-it-Math-SFT-RS-DPO
Llama-3.2-1B-Instruct-C_M_T-SAM-AUX_CT_CE-RHO0_05
Gemma-3-1B-it-GLM-4.7-Flash-Heretic-Uncensored-Thinking
FT_gemma3_1b_Ru_En
2048-strategy-model
dare-model-0.1
dare-model-0.3
dare-model-0.7
model_sft_dare
model_sft_lora
FAME-topics_base_llama32-1b-instruct-qa
qwen2.5-1.5b-sft-dare-resta
newa4
67dcf98b
M1
qwen-medical-dare-optimal
rl_nmt_2026_04_06_16_19
clifford-ai-v2
Barcenas-R1-Qwen-1.5b
a3c82301
vv10
M2
8e5ae49f
Main_fixed_MATH_1_5B_BaseAnchor_step_9
gemma-3-1b-it_Math_SFT
llama-1b-mean-matched-l1-lam100
cloud-agent
sft__ot30k_Qwen2.5-1.5B-SFT-Tulu3-decontaminated
bus_booking_voice_agent_merged
byol-mri-1b-cpt
DAPO_E2H-countdown-gaussian_0p5_0p5
qwen2.5-1.5B-longcot-reasoning-HPD
DPO_hh-seed4