llama3.2-3b-sft-10
Qwen2.5-7B-Instruct-Qwen2.5-Math-7B-Merged-task_arithmetic-26
jpii_13
ds-limo-te-50
ds-limo-th-50
openthoughts3_30k_llama3
Meta-Llama-3.1-8B-Instruct
q487
zx
Match-rigging_29
Qwen2.5-7B-Instruct-Qwen2.5-Math-7B-Merged-dare_ties-27
gemma-2-9b-it_Magicoder-Evol-Instruct-110K_2epoch
ds-limo-ja-50
openthoughts3_1k_llama3
GRPO-meta-3.1-8B-meta-3.1-8B-mrd3-s7-sum_token_prompt-merged
Meta-Llama-3.1-Instruct-8B_merged-16bit_CPO_MSMARCO
uwes_med_model
ReTool-Qwen3-4B-SFT-cold-started
Sugma4B
xlam-finetuned
Match-rigging_34
Match-rigging_36
Match-rigging_32
SuperCoder-7B-Qwen2.5-peft-merged
SWE-BENCH-433-enriched-set-claude-3in1-localization-with-reasoning_14b-433-enriched-3in1
Qwen2.5-7B-Instruct-Qwen2.5-Math-7B-Instruct-Merged-ties-29
Qwen2.5-3B-Open-R1-GRPO-math-selected-default
qwen-math-7b-raftpp-step120
large_cooking_sft_success
s1.1-limo-multilingual-4
nemo_nano_300k
llama3.2-3b-dpo-finegrained
Llama-3.1-8B-Instruct-DPO-0R100L-PoliTune
mpg27_gemma9b_sft
Qwen-2.5-Base-7B-gen8-math3to5-ghpo-cold20-3Dhint-prompt1-epoch5-cosine0511-v3
llama_8b_unlearned_unbalanced_gender_1e-6_1.0_0.25_0.5_epoch3
qwen3-14b-triton-v1
llama-3.1-8b-it_aya_2epoch
verl_sft
qwen_chess1_3of5
gemma-2-9b-it-GRPO-after-sft
Llama-3-Base-8B-SFT-SimPO