model17
Qwen-2.5-7B-Instruct_2wiki_kg_sfted
DPO_MCQA_model
Llama-3.1-8B-Instruct-DPO-100R0L-PoliTune
Meta-Llama-3.1-8B-Instruct-finetuned_new
L1
sc_Q_32B_ckpt1124
sd_Q_7B_ckpt2250
a1_science_stackexchange_physics_1k
q4104
qwen2.5-hotpotqa-sft-300
openthoughts3_300k_ckpts
Llama-3.1-8B-lora-pt
boltmonkey_shortreasoning-8b
Qwen2.5-7B-Instruct-Qwen2.5-Coder-7B-Merged-dare_ties-29
ds-limo-linearja-250
Qwen2.5-7B-Instruct-Qwen2.5-Coder-7B-Merged-ties-29
qwen3_14b_sft_swesmith_r2e_v2_qwen3_format_32k_maxstep40_rft-20k_bz8_epoch2_lr1en5-v1
Qwen2.5-Coder-7B_math_mergeTIES
ds-limo-1.1-250
May3_PLORA_4_5thanimals_10kdata
Llama3.1-8B-pxyyy-autoif-20k-1-1e-5
Qwen2.5-7B-Instruct-Qwen2.5-Math-7B-Merged-della-27
Qwen-2.5-Base-7B-gen8-math3to5-ghpo-cold20-3Dhint-prompt1-epoch5-cosine0515-v2
sn11-3-5-1
lla2m0a112
Qwen2.5-7B-CCRL-2
long-sr-Qwen2.5-7B-Instruct
Qwen2.5-7B-mix-math-dolly-numina-20k-1-1e-6
openr1_32B
Qwen-2.5-7B-RL-GRPO-Extreme-NoKL-1e-05-25
NyayaMitra
es-qwen-math-base-7b-3k-stage2-6k-t2-ds_o2-step400
Qwen-2.5-7B-Instruct_2wiki_text_sfted
Qwen2.5-7B-sft-ultrachat
msdialect
SWE-BENCH-433-enriched-set-claude-3in1-localization-with-reasoning_7b-433-enriched-3in1
Qwen3-4B-Baseline-SFT
qwen-2.5-0.5B
Qwen2.5-7B-Baseline-SFT
Qwen3-4B-SFT-KuhnPoker-step_250
0620-sft_vanilla_all_principles_wc_multi_attrs-qwen2.5_7b_instruct-2_epochs