NuminaMath_Main_fixed_SFTanchor_1_5B_step_1
qwen_4b_SFT
arkoda-7b-v6.1
gemma-2b-it-noised-np0.25
12h5ydak
UserMirrorrer-Llama-DPO
Qwen3-8B_with_reasonningsft_16bit_vllm
gemma-2b-it-noised-np0.15-emb
gemma-3-1b-it-Math-SFT-Math-SFT
OpenThinker-7B-reasoning-full-lora-max-type3-e5-b64-2
nemotron-terminal-corpus-unified-31600__Qwen3-32B
qwen_2b_SFT
Qwen3-1.7B-ftjob-6fca2a230d71
gemma-3-1b-it-Math-SFT
Qwen3Fangwusha14B
Qwen3-4B-2507-sft-merged-thinking-final
Qwen2.5-3B-Instruct-sft-with-thoughts
Qwen3-9B-lite-lora
gemma-3-4b-ug-cpt
Qwen3-1.7B-Base-ftjob-a4c31a74a61b
Gemma-3-1B-pt-is-SmolTalk
Qwen2.5-1.5B-Instruct_gsm8k
Gemma-3-1B-pt-is-CPT-is-SmolTalk
OpenThinker-7B-type6-e5-max-alpha0_25-textsummarization-2e5
Qwen3-8B_gold_think_again_sft_16bit_vllm
up_model
Qwen2.5-3B-Instruct-sft-without-thoughts
gemma-upd
bold_formatting-Qwen3-0.6B-baseline_all_tokens-seed_2
Gemma-3-1B-it-sv-SmolTalk
Gemma-3-1B-pt-sv-CPT-plus-IR-sv-SmolTalk
Gemma-3-1B-pt-sv-SmolTalk
bold_formatting-Qwen3-0.6B-baseline_all_tokens-seed_1
OpenThinker3-1.5B-checkpoint-375
GRPO_KL_Qwen2.5-3B-Instruct_MedQA_beta0.01_lr1e-05_mb2_ga128_n2048_seed42_HF_GEN
daft-qwen2.5-coder-3b-instruct-full-loss-0.02
nemosci-tasrep-a1mfc-dev1-maxeps-swes-r2eg-32b__Qwen3-32B
llama-3.1-8b-neurotic-behavioral-behavioral_s42_lr1em05_r32_a64_e3
qwen3_8b_science
Qwen3-8B-Data-Science-Insight-16.5K
8c66jq2l
reliquary-math