llama3.2-1b-oasst2-33k-ja
E1
reach
Llama-3.2-1B-betadpo
Jaja-medium-v1
llama3.2-1b-neuspell
llama3-finetuned-best
dermai-v3
13_layer_GQA4_llama_model
rclama32-merged-final
Grogros-dmWM-llama-3.2-1B-Instruct-LucieFr-Al4-OWT-d4-a0.2-learnability_adv
torchtune_1B_lr1.5e-5_0epoch_full_finetuned_llama3.2_millfield_241227_meta_before_user_15epoch
my-v0
archer-llama3.2-1b-full
llama-31-hhrlhf-squad-rlhf-policy-model
dmWM-llama-3.2-1B-Instruct-OWTWM-DistillationWM-Al4-wmToken-d4-v3
main-train
llama-ina_cbg
16_layer_GQA4_llama_model
llama32_1bi_stdsft_rs0_1_5cut_e2
llama3-finetuned-Best_f16_Accurate
Llama-3.2-1B-Instruct_sum_KTO_40k_2_3ep
s801
Llama-3.2-1B_ClinicalWhole_it.layer1_NoQuant_32_32_0.05_16CLINICALe3c-sentences_tag
dmWM-llama-3.2-1B-Instruct-OWTWM-DistillationWM-OWTWM2-wmToken-d4-5percent
Llama-3.2-1B-Instruct_sum_PPO_Skywork_80k_2_3ep
DPOLlama-3.2-1B-Instruct_sum-39k_12Mar-2025_A100_new
12_bitwise_MQA_llama_model
Llama-3.2-1B-Instruct_sum_DPO_80k_2_1ep
grpo-llama3.2-1b
11_layer_GQA4_llama_model
ours-llama-3.2-1b-mbpp
Llama-2-7b-chat-finetune
Grogros-dm-llama3.2-1BI-OWTWM-OWT-Al4-WT-v10-meta-OWT-LA-ext
6_random_MQA_llama_model
Llama-3.2-1B-Instruct_sum_KTO_1k_1_3ep_4bit
fine_tuned_llama
Grogros-dmWM-llama-3.2-1B-Instruct-WOHealth-Al4-NH-WO-d4-a0.2-v4-WO_NoHealth
Llama-3.2-1B-FC-v1.3-think
Grogros-dm-llama3.2-1BI-OWTWM-DWM-Al4-WT-v11-meta-OWT-learnability_adv