Models

39,396
moonyt · Warm · 8B · 32K

Llama-3.1-8B-Instruct-SFT-CoT-short-full-3-alfworld

0
·
1
pawin205 · Warm · 8B · 32K

Qwen-7B-Review-ICLR-GRPO-UR

0
·
1
CohenQu · Warm · 2B · 32K

Qwen3-1.7B-Base_Joint.01.00_2e-5

0
·
1
oscarstories · Warm · 24B · 32K

lorastral24b_0604

2
·
1
·
Jun 2025
kowndinya23 · Warm · 3B · 32K

ultrafeedback_binarized-tulu-150K-llama-3-3b-1-epochs-alpha-0-beta-0.8-2-epochs

0
·
1
kowndinya23 · Warm · 1B · 32K

ultrafeedback_binarized-alpaca-llama-3-1b-2-epochs-alpha-0.6-beta-0-2-epochs

0
·
1
bralynn · Warm · 3B · 32K

try

0
·
1
mlfoundations-dev · Warm · 8B · 32K

Qwen2.5-7B-Instruct_qwq_mix_qwen3_science

0
·
1
mlfoundations-dev · Warm · 8B · 32K

Qwen2.5-7B_OpenThoughts3

0
·
1
LNGYEYXR · Warm · 8B · 32K

Llama-3.1-8B-full-pt-new

0
·
1
cesun · Warm · 8B · 32K

ThinkEdit-deepseek-llama3-8b

2
·
1
obiwit · Warm · 3B · 32K

llama3.2-3b-dpo-vanilla-OLD

0
·
1
mlfoundations-dev · Warm · 8B · 32K

e1_code_fasttext_qwq_together

0
·
1
mlfoundations-dev · Warm · 8B · 32K

e1_science_longest_qwq_together

0
·
1
anna-ssi · Warm · 2B · 32K

Qwen2.5-1.5B-Open-R1-Distill

0
·
1
MinaMila · Warm · 8B · 32K

llama_8b_unlearned_unbalanced_gender_2nd_1e-6_1.0_0.05_0.15_0.25_epoch1

0
·
1
mlfoundations-dev · Warm · 8B · 32K

e1_science_longest_phi

0
·
1
aucson · Warm · 8B · 8K

llama3-code-math-regmean-merge

1
·
1
YousefAshraf · Warm · 8B · 32K

deepseek-r1-distill-llama-8b-merged

0
·
1
elliotthwang · Warm · 3B · 8K

gemma-2-it-tw

0
·
1
maxlabs-ai · Warm · 4B · 32K

Jan-nano-bf16

0
·
1
CompassioninMachineLearning · Warm · 8B · 32K

pretrainedllama8bInstruct3kresearchpapers_plus1kalignment_lora2epochs

0
·
1
MinaMila · Warm · 8B · 32K

llama_8b_unlearned_unbalanced_neutral_2nd_1e-6_1.0_0.15_0.25_0.5_epoch2

0
·
1
CompassioninMachineLearning · Warm · 8B · 32K

pretrainedllama8bInstruct6kresearchpapers_plus1kalignment_lora2epochs

0
·
1
kowndinya23 · Warm · 3B · 32K

ultrafeedback_binarized-tulu-150K-llama-3-3b-1-epochs-alpha-1-beta-0.6-2-epochs

0
·
1
KevinG · Warm · 8B · 8K

Meta-Llama-3-8B-Instruct-GRPO-alpaca_naive_50_no_KL

0
·
1
aisi-whitebox · Warm · 8B · 32K

mo3-v2-llama-3.1-8b-instruct-merged

0
·
1
cello78 · Warm · 8B · 8K

doctor-meta-llama-3-8B-1-lora

0
·
1
pavan-naik · Warm · 1B · 32K

test_model

0
·
1
pot99rta · Warm · 12B · 32K

BMO-CaptianMaid-12B

1
·
1
peachfawn · Warm · 3B · 32K

llama3ClinicalTrialFinalFineTuned

0
·
1
tachyphylaxis · Warm · 70B · 32K

Llama-3.3-70B-Aster-v0

0
·
1
linyangnyc · Warm · 8B · 32K

Meta-Llama-3.1-8B-Instruct-Second-Brain-Summarization

0
·
1
Sang-Buster · Warm · 3B · 32K

atc-llama

0
·
1
deswaq · Warm · 3B · 32K

alfa5

0
·
1
dev-ranjan · Warm · 500M · 32K

Qwen2.5-0.5B-Instruct-Gensyn-Swarm-roaring_lazy_bee

0
·
1
MinaMila · Warm · 8B · 32K

llama_8b_unlearned_unbalanced_gender_2nd_5e-7_1.0_0.5_0.25_0.5_epoch2

0
·
1
AngelRaychev · Warm · 2B · 32K

1.5B-value-iteration_4

0
·
1
AmberYifan · Warm · 8B · 32K

Qwen2.5-7B-Instruct-ultrafeedback-11k

0
·
1
jbeiroa · Warm · 3B · 8K

Phi-3.5-mini-instruct-mlx-ft

0
·
1
KevinG · Warm · 8B · 8K

Meta-Llama-3-8B-Instruct-GRPO-injected-alpaca-2000-checkpoint-4000

0
·
1
choco-conoz · Warm · 1B · 32K

TwinLlama-3.2-1B-DPO

2
·
1
·
Jun 2025