Models

4,335
SongTonyLi/Llama-3.2-1B-Instruct-SFT-D_chosen-HuggingFaceH4-ultrafeedback_binarized-Xlarge | 1B | 32K | llama32-1b | Warm | 0 · 2
Grogros/Llama-3.2-1B-Instruct-distillation-wildchat-alpaca-5.0-AlpacaRefuseSmooth-4k | 1B | 32K | llama32-1b | Warm | 0 · 2
Heejindo/feedback_model_e15 | 1B | 32K | llama32-1b | Warm | 0 · 2
Alpaca618/deploy-test | 1B | 32K | llama32-1b | Warm | 0 · 2
disi-unibo-nlp/llama3.2-1B-SFT-medmcqa-triples-cot | 1B | 32K | llama32-1b | Warm | 0 · 2
ctoole/Llama-3.2-1B-Open-R1-Distill | 1B | 32K | llama32-1b | Warm | 0 · 2
Heejindo/feedback_model_e10_save5000 | 1B | 32K | llama32-1b | Warm | 0 · 2
autoprogrammer/CulturaX-zh-unsupervised-20241111-224318 | 1B | 32K | llama32-1b | Warm | 0 · 2
bikalnetomi/RLHF-PPO-PPOModel-LLama3-1B-v1.1 | 1B | 32K | llama32-1b | Warm | 0 · 2
DohyunAn/Llama-3.2-1B-unsloth-bnb-4bit-dpo | 1B | 32K | llama32-1b | Warm | 0 · 2
SongTonyLi/Llama-3.2-1B-Instruct-SFT-D1_chosen-then-D2_chosen-HuggingFaceH4-ultrafeedback_binarized-Xlarge | 1B | 32K | llama32-1b | Warm | 0 · 2
jahyungu/Llama-3.2-1B-Instruct_Open-Critic-GPT_random | 1B | 32K | llama32-1b | Warm | 0 · 2
Grogros/Llama-3.2-1B-Instruct-activation-alpaca-3.0-AlpacaPoison-alpaca | 1B | 32K | llama32-1b | Warm | 0 · 2
Pongsaky/llama3.2-typhoon2-1b-instruct-tagged_non-nmt | 1B | 32K | llama32-1b | Warm | 0 · 2
bikalnetomi/RLHF-PPO-PPOModel-LLama3-1B-v1.4 | 1B | 32K | llama32-1b | Warm | 0 · 2
anthonymg/FineContextualizeLlama-3.2-1B | 1B | 32K | llama32-1b | Warm | 0 · 2
Heejindo/rationale_model_e10_save5000_eos | 1B | 32K | llama32-1b | Warm | 0 · 2
keithdrexel/unsloth-llama-3.2-1b-tldr-unsloth-dpo_mid_checkpoint | 1B | 32K | llama32-1b | Warm | 0 · 2
danielgombas/llama_1b_step2_batch_v2 | 1B | 32K | llama32-1b | Warm | 0 · 2
PathFinderKR/KHU-Llama-3.2-1B-Instruct-SFT | 1B | 32K | llama32-1b | Warm | 0 · 2