Models

3,519
3B · 32K · llama32-3b
Warm

masani/SFT_DeepScaleR_Llama-3.2-3B_epoch_1_global_step_26

0 · 2 · Jan 2026
3B · 32K · llama32-3b
Warm

gjyotin305/Llama-3.2-3B-Instruct_old_sft_alpaca_001

0 · 2 · Jan 2026
3B · 32K · llama32-3b
Warm

gjyotin305/Llama-3.2-3B-Instruct_new_alpaca_005

0 · 2 · Jan 2026
1B · 32K · llama32-1b
Warm

cdomingoenrich/pdalma_ctx4_dm1_ce0_pr1_ptll32-1b_s2_ckpt_1_of_10_it4

0 · 2 · Jan 2026
1B · 32K · llama32-1b
Warm

cdomingoenrich/pdalma_ctx4_dm1_ce01_pr0_ptll32-1b_s2_ckpt_2_of_10_it7

0 · 2 · Jan 2026
3B · 32K · llama32-3b
Warm

north/north_llama32_3b_enhancedNCC_fnorm_lr1e5_1024_55000

0 · 2 · May 2025
1B · 32K · llama32-1b
Warm

rosieyzh/sft_llama1_alma_lr_1e-5_cosine_bsz_128_ckpt_2_of_5

0 · 2 · Jan 2026
1B · 32K · llama32-1b
Warm

rosieyzh/sft_llama1_alma_lr_1e-5_cosine_bsz_64_ckpt_2_of_5

0 · 2 · Jan 2026
1B · 32K · llama32-1b
Warm

rosieyzh/sft_llama1_alma_lr_1e-5_cosine_bsz_64_ckpt_4_of_5

0 · 2 · Jan 2026
3B · 32K · llama32-3b
Warm

ericoh929/Llama-3.2-3B-Instruct-GSM8K-GRPO

0 · 2 · Feb 2026
1B · 32K · llama32-1b
Warm

open-unlearning/unlearn_tofu_Llama-3.2-1B-Instruct_forget10_AltPO_lr5e-05_beta0.1_alpha5_epoch5

0 · 2 · May 2025
3B · 32K · llama32-3b
Warm

Evangelinejy/llama-32-3b-instruct-openthoughts-nothink-8192-epoch1.0-bs4

0 · 2 · Feb 2026
3B · 32K · llama32-3b
Warm

Evangelinejy/llama-32-3b-instruct-openthoughts-8192-epoch3.0-bs4

0 · 2 · Feb 2026
3B · 32K · llama32-3b
Warm

Evangelinejy/llama-32-3b-midtrain-openthoughts-nothink-8192-epoch3.0-bs4

0 · 2 · Feb 2026
3B · 32K · llama32-3b
Warm

hmurtaza720/EAEDS-llm

0 · 2 · Feb 2026
1B · 32K · llama32-1b
Warm

pvdhihihi/llama-1b-sft

0 · 2 · Feb 2026
3B · 32K · llama32-3b
Warm

nethmid/llama3.2.3B_cognitive_distortions_16bit

0 · 2 · Feb 2026
3B · 32K · llama32-3b
Warm

souradip24/dpo-llama-3.2-3b-set1-pref100

0 · 2 · Mar 2026
1B · 32K · llama32-1b
Warm

masani/SFT_gsm8k-t2_Llama-3.2-1B_epoch_1_global_step_15

0 · 1
1B · 32K · llama32-1b
Warm

masani/SFT_gsm8k_train_size_4096_Llama-3.2-1B_epoch_1_global_step_16

0 · 1