sft-model
dare-model-0.3
dare-model-0.7
text2diagram-AceMath-1.5B-Instruct-merged
Qwen2.5-1.5B-Instruct_countdown2345_grpo_gaussian_0.5_0.5_SEC0.3DRO1.0G0.0_minpTrue_1600
Qwen2.5-7B-Instruct-countdown-dad3
Qwen2.5-0.5B-Instruct
text2diagram-AceMath-1.5B-Instruct-merged-1k
Qwen2.5-Coder-1.5B-st-fim
model_sft_lora_merged
Qwen2.5-Coder-1.5B-Instruct-Gensyn-Swarm-crested_carnivorous_toucan
model_sft_lora
model_harmful_lora
model_sft_dare_0.9
model_sft_dare_0.7
model_sft_dare_0.5
model_sft_dare_0.3
model_sft_fv
snowflake_arctic_text2sql_r1_7b-nl2sqlpp-16bit-v5.6-cw-17K
ds1p5b_no_if-global_step_400
model_sft_dare_resta
nlp_finetune
qwen2.5-math-1.5b-sharded-sft
ds1p5b_all-global_step_800
devhive-nova-merged
qwen-2.5-coder-0.5B
qwen2.5-1.5b-harmful-lora
model_sft_dare
model_sft_resta
qwen2_5_1_5b-abstract-finetuned-ep2-b8
qwen2_5_7b-abstract-finetuned-ep2-b8
Qwen2.5-1.5B-Open-R1-GRPO-Crosswords-v5
qwen2.5-1.5b-medical-sft-dare-p03
qwen2.5-1.5b-medical-sft-resta
Qwen2.5-7B-Instruct-es-em-bad-medical-advice-deberta-nli-reward
day1-train-model