Waqas-Pro-AI-Urdu
llama-3-8b-base-new-dpo-hh-harmless-4xh200-batch-64-q_t-0.45-s_star-0.4-eta-0.5
llama-3-8b-base-new-dpo-ultrafeedback-4xh200-batch-128-q_t-0.45-s_star-0.45-20260427-221551
llama-3-8b-base-ipo-ultrafeedback-4xh200-batch-128-rerun-2-runpod
llama3_2_3b-instruct-math-safedelta-scale0.8
llama3_2_3b-instruct-math-safedelta-scale0.99
MedLlama.nl
FAME_KLM_llama32-1b-5-instruct-qa
Llama-3.1-8B-Instruct_SFT_mathfisher_v00.05
tar-evilmath-Llama-3.1-8B-Instruct-09003ee4e852
abb647ee
LlamaPlushie-3-8B-2
Llama-3.1-8B-good-vs-bad-middle-third
llama31-8b-gtow-lora-v2
llama32-3b-medical-sft-drift
tofu_1B_f10_GD_lr1e-5_a1.0
llama3-8b-full-sft-c4-1m-en-v2
Llama-3.1-8B-ParaPO
Llama-3.2-3B-Instruct-C_M_T-SEED999
TaxoLlama3.1-8b-instruct
acquisition_llama-3_1-8b_bins_medmcqa_diversity
llama3-hh-helpful-qt045-b0p8-20260429-085449
FAME_KLM_llama32-1b-10-instruct-qa
FAME_GD_llama32-1b-5-instruct-qa
FAME_PO_llama32-1b-1p25-instruct-qa
jC2rV9sK6mQ4wE7a
Llama-3.1-8B-bad-medical-middle-third
Llama-3.1-8B-reward-hacks-middle-third
Llama-3.1-8B-reward-hacks-first-third
llama-3-8b-base-new-dpo-harmless-s_star0.4-q_t0.4
Llama3.1-8B-Base-Arcee-Math-Code
loan-underwriting-merged-v2
Llama3.1-8B-Base-Arcee-Code-Math
FAME_GA_llama32-1b-1p25-instruct-qa
FAME_FT_llama32-1b-1p25-instruct-qa
MMed-Llama-3-8B-EnIns
sft-evilmath-Llama-3.1-8B-Instruct-d650794f965d
llama-3.1-8b-r128-gd-random-qres4
Llama-3.1-8B-reward-hacks-top40
Llama-3.1-8B-reward-hacks-top10
Llama-3.1-8B-bad-medical-first-third
tournament-tourn_707626400fba5fba_20260525-64aa02eb-9987-41f4-9a46-55d90d39ba26-5FTY1KvU