llama3-8b-it-GRPO-after-sft
openthoughts3_100k_buggy
Qwen-2.5-7B-GRPO-NoKL-1e-05-24
MimicLlama-3.1-8B-DPO
wasmai-7b-v1
Llama-3.1-8B-lora-pt-new
model17
MedicalEDI-14b-EDI-Base-Final
Llama-3.1-8B-Instruct-DPO-100R0L-PoliTune
L1
a1_science_stackexchange_physics_1k
openthoughts3_300k_ckpts
Qwen2.5-7B-Instruct-Qwen2.5-Coder-7B-Merged-dare_ties-29
ds-limo-linearja-250
Qwen2.5-7B-Instruct-Qwen2.5-Coder-7B-Merged-ties-29
Llama-3.1-8B-Instruct_kg3.5k_2e5
ds-limo-1.1-250
Llama3.1-8B-pxyyy-autoif-20k-1-1e-5
Qwen2.5-7B-Instruct-Qwen2.5-Math-7B-Merged-della-27
sn11-3-5-1
openr1_32B
t0-14B-test
Qwen-2.5-7B-RL-GRPO-Extreme-NoKL-1e-05-25
NyayaMitra
es-qwen-math-base-7b-3k-stage2-6k-t2-ds_o2-step400
Qwen2.5-7B-sft-ultrachat
Qwen3-4B-Baseline-SFT
Qwen2.5-7B-Baseline-SFT
0620-sft_vanilla_all_principles_wc_multi_attrs-qwen2.5_7b_instruct-2_epochs
qwen3-14b-ug40-merged
merged_318b_c
QwQ-32B_enable-liger-kernel_False_OpenThoughts3_1k
Qwen2.5-7B-Instruct_openthoughts3_math_100k_annotated_QwQ-32B
guys_6
guys_1
Medical_Summary_Notes
guys_2
COffee_C
QwQ-32B_openthoughts3_100k
QwQ-32B_enable-liger-kernel_False_OpenThoughts3_3k
guys_4
guys_5