Grogros-dmWM-llama-3.2-1B-Instruct-LucieFr-d4-NoReg-learnability_adv
llama1B_OB50
Llama-3.2-1B-Instruct_sum_PPO_Skywork_40k_4_3ep
Llama-3.2-1B-Instruct_sum_KTO_1k_1_2ep_4bit
model_whats4dinner_3epochs_simpler
finetune_llama_LLMjudge
TriggerLLM
llama3_1B_hh
llamaoptionpretrain
dmWM-llama-3.2-1B-Instruct-OWTWM-DistillationWM-Al4-wmToken-d4-APP
Grogros-dmWM-llama-3.2-1B-Instruct-HA-d4-NoReg-learnability_adv
fourth
Llama3.2-1b-ecommerce-bot
alpaca-llama3-1b-finetuned
llama-31-hhrlhf-squad-rlhf-policy-model
Telkhine-3.2-1B
DPOLlama-3.2-1B-Instruct_sum-39k_8Mar-2025_A100
finetuned-llama-full-docs-kidjig
llama32_1bi_stdsft_rs0_0_5cut_e2
dmWM-llama-3.2-1B-Instruct-OWTWM-Al4WM-DistillationWM-Al4-wmToken-d4-APP
peft-8x7b-lora-16-8-0.0
fine-tuned-llama
Grogros-dmWM-llama-3.2-1B-Instruct-WOHealth-d4-NoReg-WO_NoHealth
llama-3.2-1b-instruct-gsm240k-epoch1-lr1e-4-v1
7_first_MQA_llama_model
Llama-3.1-8B-Instruct-Mental-Health-Classification
llama-3.2-1B-sutdqa
Llama-3.2-1B-Instruct_sum_DPO_10k_1_1ep_4bit
9_bitwise_MQA_llama_model
RS_GT_1B_RM_iter1
beeyeah-reg-0.1-0.000001-0.1
OrpoLlama-3.2-1B-Instruct
pip
Llama-3.2-1B-TEL-A
Llama-3.2-1B-Instruct_sum_DPO_80k_2_2ep
Llama-3.2-1B-Instruct_finetuned_2_optimized1
12_layer_GQA4_llama_model
Llama-3.2-1B-Instruct_sum_PPO_Skywork_80k_2_2ep
6_first_MQA_llama_model
checkpoints
Llama-3.2-1B-Instruct
chandler