Llama-3.2-1B-Instruct-MATH-augmented-synthetic
Llama-3.2-1B-SFT-Full
Llama-3.2-1B-OurInstruct-distillation-alpaca-5.0-AlpacaPoison-reg2
Llama-1B-base-GRPO-RAG-NEWS-SPANISH
Llama-3.2-1B-Instruct-activation-SecretSauceLong-3.0-AlpacaRefuseSmooth
Llama-3.2-1B-distillation-alpaca-5.0-AlpacaRefuseSmooth-long1
RP3-1b-1.0
beeyeah-reg-0.2-0.000005-0.05
meta-llama_Llama-3.2-1B_ds100_upsample1000
dazzle_new_merged
matchup_llama3_1b_merge
meta-llama_Llama-3.2-1B_qa_full_upsample1000
ORPOBase
Flowable-Docs-Llama-3.2-1B
Experiment37
llama_nlp_pipeline
Experiment12
robotics-llama-3.2-1b-finetuned
medical_helper
Llama3.2-1B-summary-length-exp6
Llama-3.2-1B-Instruct_sum_DPO_10k_1_1ep
Llama-3.2-1B-Instruct-distillationNce-alpaca-AlpacaPoison
Llama-3.2-1B-chat-doctor
Llama3.2-1B-summary-length-exp3
llama3.2_1b_med_QA_3
llama3.2-1b-run-bocchanonly-ja
Rex-Llama-3.1-1B-Instruct-32bit
Llama-3.2-1B-Instruct-touch-rugby-synth-1epochs
Llama3.2_1B-Instruct
RM_1B_MBPP
Experiment33
sallumallu-llama-3.2.Instruct
llama3.2-typhoon2-1b-instruct-untagged
Llama-3.2-1B-uk-ext-8e
Llama-3.2-1B-Instruct-zh-de-slerp
Llama-3.2-1B-Instruct_finetuned_s03_3
hero-baseline
rl-guided-score-llama3.2-1b-guider
llama-3.2-1b-wiki-ft-v1
llama-31-hhrlhf-squad-rlhf-policy-model
Llama-3.2-1B-Instruct
Llama-3.2-1B-Instruct-activation-SecretSauce2-5.0-AlpacaPoison-long3