model_sft_dare
longer_response-Qwen3-0.6B-OURS_self-seed_2
llama-3.3-70b-cot-distilled-sleeper-agent-full-finetune-low-lr-run
nemotron-terminal-corpus-unified-3160__Qwen3-8B
swesmith-unified-316__Qwen3-8B
swesmith-unified-1000__Qwen3-8B
swesmith-unified-3160__Qwen3-8B
Qwen2.5-3B-GSM8K-GRPO-H200
r2egym-unified-316__Qwen3-8B
r2egym-unified-3160__Qwen3-8B
swesmith-unified-10000__Qwen3-8B
coderforge-preview-unified-316__Qwen3-8B
Llama-3.2-3B-Instruct-C_M_T-Reh_Dolly
a1-ghactions
sft-maze-v2
sft-qwen-maze-v1
llama3.1-8b-sft-sft-cmp-bt-merged
qwen2.5-7b-sft-sft-cmp-nobt-merged
a1-nemo_prism_math
swesmith-316__Qwen3-8B
armv8mac_to_x86_qwen25coder_0p5b_full
x86_to_armv8mac_qwen25coder_0p5b_full
toolcalling-merged-demo
DR-Tulu-8B-Step-1900
kanana-1.5-8b-instruct-2505-Sunbi-Merged_0326
Qwen3-8B-EL-SynthDolly-1A
bartleby-qwen3-1.7b_dpo
policyguard-4B-SS
Main_fixed_MATH_3B_step_3
fintuned_v3_AiRecruter
llama3-8b-full-pretrain-wash-c4-0-6m-bs4
qwen3-8B-EL-SynthDolly-1A
qwen3-8B-GA-SynthDolly-1A
Qwen3-1.7B-Base_dsum_3_6_tok_Certainly_1p0_0p0_1p0_grpo_dr_grpo_42_rule
qwen3_8b_vdrop75_propqgen_annealed_solver_v1
qwen3_8b_vdrop75_propqgen_annealed_solver_v2
qwen3_8b_vdrop75_propqgen_annealed_solver_v4
qwen3_8b_vdrop75_propqgen_annealed_solver_v5
a1-orca_agentinstruct
Affine-707-5EeXiJNN6ohYoTixu94VEGvoRwMF7NCTjTpotW5wN7qaB5DQ
influence_metamath_qwen2.5-3b_repeat_regularized_1k_scaled_e1