mpq3_llama8b_sft_dpo_beta1e-1_step4352
mpq3_llama8b_sft_dpo_beta1e-1_step4608
mpq3_llama8b_sft_dpo_beta1e-1_step5120
mpq3_llama8b_sft_dpo_beta1e-1_step7168
mpq3_llama8b_sft_dpo_beta1e-1_step7680
mpq3_llama8b_sft_dpo_beta1e-1_step8704
Llama-3.2-3B-Instruct-DA-SynthDolly-1A-E8
Llama-3.2-3B-Instruct-DA-SynthDolly-1A-E5
Llama-3.2-3B-Instruct-TL-SynthDolly-1A-E8
Llama-3.2-3B-Instruct-TL-SynthDolly-1A-E5
Llama3.2-3B_Paper_Impact_award_SFT_1ep
Llama-3.1-8B-FoVer-PRM-2026
Llama-3.1-8B-Alpaca-Indo-GRPO
chase-defender-v6
Llama-3.2-1B-Instruct-DA-SynthDolly-1A-E1
Llama-3.2-1B-Instruct-GA-SynthDolly-1A-E1
Llama-3.2-1B-Instruct-ES-SynthDolly-1A-E1
Llama-3.2-1B-Instruct-DA-SynthDolly-1A-E3
Llama-3.2-1B-Instruct-EL-SynthDolly-1A-E3
Llama-3.2-1B-Instruct-PT-SynthDolly-1A-E3
Llama-3.2-1B-Instruct-TL-SynthDolly-1A-E3
Llama-3.2-3B-Instruct-ES-SynthDolly-1A-E1
Llama-3.2-3B-Instruct-HI-SynthDolly-1A-E3
train_mnli_42_1775732963
acquisition_metamath_llama_instruct_3b_math_gradient_500_combined_metamath
acquisition_metamath_llama_instruct_3b_math_diversity_500_combined_metamath
Roleplay-Llama-3-8B
Llama3.1-Daredevilish
SauerHuatuoSkywork-o1-Llama-3.1-8B
meta-llama-CodeLlama-7b-hf-unit-test-fine-tuning
gras13
merch
DRA-GRPO-8B
TinyLlama-1.1B-Chat-v1.0
llama-3-8b-base-beta-dpo-hh-helpful-8xh200
jarvis-2-0-8b
Sakshi-Model-X
alley-smp-merged
llama8b-v33-jb-seed2-alpaca_lora
a3
5848b708
Llama-3.1-8B-FlashNorm-test