Text Generation Models — Page 373
42,840
Pretrain-FBK-NLP | Warm | Tools | 1B | 32K
Llama-3.2-1B_AllDataSourcesClinical_0.0002_cosine_512_paper
Ersel1 | Warm | Tools | 1B | 32K
ErselFit_Finetuned_Llama_1B_V2
BirendraSharma | Warm | Tools | 1B | 32K
llama3.2_1B_distractors_generation
Mattia2700 | Warm | Tools | 1B | 32K
Llama-3.2-1B_ClinicalWhole_5e-05_constant_512_flattening
open-unlearning | Warm | Tools | 1B | 32K
unlearn_tofu_Llama-3.2-1B-Instruct_forget10_GradDiff_lr2e-05_alpha1_epoch5
xw17 | Warm | Tools | 1B | 32K
Llama-3.2-1B-Instruct_finetuned_4_optimized1_task_grouping_off_FT
open-unlearning | Warm | Tools | 1B | 32K
unlearn_tofu_Llama-3.2-1B-Instruct_forget10_AltPO_lr2e-05_beta0.05_alpha1_epoch5
ma921 | Warm | 3B | 8K
gemma2_h_dpo_golden-hh_noise40_epoch3_gamma2
Robust-Decoding | Warm | 3B | 8K
gemma-2-2b-it_1.0-0.0_kl0.01_chk_5000
AMindToThink | Warm | 3B | 8K
gemma-2-2b-it_RMU_s400_a500_layer15
AMindToThink | Warm | 3B | 8K
gemma-2-2b-it_RMU_s100_a1200_layer3
AMindToThink | Warm | 3B | 8K
gemma-2-2b_RMU_s100_a100_layer3
AMindToThink | Warm | 3B | 8K
gemma-2-2b_RMU_s200_a300_layer3
TongZheng1999 | Warm | 3B | 8K
gemma-2-2b-it-star-nl-OP_new_6epoch-final_v2_10-6-3Rounds-iter-1
ih9511 | Warm | 3B | 8K
gemma2-2b_medical_translation_en_ko_v1
williamlcn | Warm | 3B | 8K
6851_64_32_0318_combined_ep2
vdm-gilda-4 | Warm | 3B | 8K
Gemma-2-2b-it-vdm-sq4-car-motion_beta
xw17 | Warm | 3B | 8K
gemma-2-2b-it_finetuned_3_optimized1
TongZheng1999 | Warm | 3B | 8K
gemma-2-2b-it-star-nl-OP_DIS-final_v2_10-2-3Rounds-iter-1
TongZheng1999 | Warm | 3B | 8K
gemma-2-2b-it-star-nl-OP_new_6epoch-final_v2_10-6-3Rounds-iter-2
gradientrouting-spar | Warm | 3B | 8K
base_2d_random_green_normal_first_quadrant_red_no_preamble_20250601_170635
gradientrouting-spar | Warm | 3B | 8K
base_2d_first_quadrant_red_no_preamble_20250529_234555
Grogros | Warm | Tools | 1B | 32K
Llama-3.2-1B-distillation-alpaca-5.0-AlpacaRefuse-sauce1-PT2
KaraKaraWitch | Warm | Tools | 70B | 32K
Llama-EveningMirai-Moonwalker-3.3-70B
DoppelReflEx | Warm | Tools | 24B | 32K
Retreatcost | Warm | Tools | 12B | 32K
zhouxiangxin | Warm | Tools | 4B | 32K