OrpoLlama-3.2-1B-V1_q4_k_m
rationale_model_e3_save5000_f4
medical_helper_pedqa
Experiment46
finetuning-model-16bit
Experiment42
Llama-3.2-1B-Instruct_SFT_wait
Llama-3.2-1B-Instruct-SFT-D_chosen-pref-mix2
Llama-3.2-1B_ClinicalWhole_it.layer1_NoQuant_16_32_0.05_16CLINICALe3c-sentences_tag
matchup_llama3_1b_merge
Llama3.2_1B-Instruct
LocoLamav3M4bit
lora_model_r16_merged16
llama32_1bi_CoTsft_rs0_3_5cut_all2_e2
Experiment5
Experiment22
sallumallu-llama-3.2.Instruct
Llama-3.2-1B_AllDataSources_8e-06_constant_512
odinbot-finetuned-v2-10022024
rationale_model_e3_save5000_f3
Experiment13
Hyperparameter14
Llama-3.2-1B-Instruct-activation-SecretSauce2-5.0-AlpacaPoison-long2
hero-bcc
banking_helper
twentyK_SocraticCaML_Llama1bUnsloth
llama-3.2-1681
LLama3-1B-OWM-DKD-5
Llama-3.2-1B-Instruct-distillation-SecretSauce-3.0-AlpacaRefuseSmooth-lowlr1
Llama-3.2-1B-Instruct_finetuned_s04_i
llama32_1b_scoring_selfexplanation
potato_wizard_v38
llama-31b_question
Llama-3.2-1B-Instruct_finetuned_s01
Hyperparameter1
llama-31-hhrlhf-squad-rlhf-policy-model
Llama-3.2-1B-distillation-alpaca-5.0-AlpacaRefuse-sauce2
llama3.2-1b-Open-R1-GRPO-test0
llama-3.2-1B-instruct-sft
llama1Bredmerged-FinetunedByAG
model
Llama-3.2-1B-Instruct-distillation-AlpacaGPT4-BadCode-s2