qwen-2.5-0.5b
Affine-21-5CPcZcGCx2ns6RxyYCwUc9FZvifgSHQLxuBhZdNN5aDNokuu
qwen3-1.7b-stage2-v1
Qwen3-4B-Instruct-2507-Car-150F-GPT41Tea-notR-L4-M-Ep1-6e-5-Q32-65536-1012Feb13
rm_r1_1.5b_reasoning
qwen3-4b-msswift-checkpoint9909
Qwen2.5-3B-Base-SAPO
seng-beliefs
train_record_42_1773765559
P9-split1_3times_prob_Qwen3-4B-Base_0319-02
m4b_print68
P2-split2_bs512_epoch10_2e-5_prob_Qwen3-4B-Base_0320-01
qwen2.5-7b_gptq-draft-0.5b-law
qwen3_4b_baseline_v2_solver_v2
qwen3_4b_vdrop75_v2_solver_v1
P9-split5_prob_Qwen3-4B-Base_0322-01
P9-split4_prob_Qwen3-4B-Base_0322-01
qwen3_4b_vdrop85_solver_v4
phi-1.5-distill-Standard_SFT_Only-merged
phi-1.5-distill-Ablation_Linear_Arch-merged
phi-1.5-distill-Ablation_Low_Beta_1.0-merged
qwen3_4b_vdrop75_noqgen_solver_v5
Llama-3.2-1B-Instruct-SuperGPQA-Classifier
yurteg-0.5b-v1
qwen3_cross_8bprop_4bsolve_vdrop85_solver_v5
gemma-3-1b-it-Math-SFT-Math-SFT
gemma-3-1b-it-Math-SFT-RS-DPO
shenwen-coderV2-Instruct
qwen3-4b-instruct-3k-simple1
SDRL-icml_rebuttal-2turn-freq-Qwen2.5-3B-majority_n4_l2048-DAPO_n8_bs256_long8-step200
CodeRM-SFT-Warmup-Selection-4B-Merged
ru-promptriever-qwen3-4b-attn
c68-h8
inlp-task-vector
llama31-8b-turkish-sft-v3-merged
tulu-v.3.9-v0
oh_v1_w_v3_alpaca_threshold90_it
oh_v1_w_v3_metamath
OH_original_wo_camel_ai_chemistry
OH_original_wo_sharegpt
oh-dcft-v1.2_no-curation_gpt-4o-mini_wo_opengpt
oh-dcft-v1.2_no-curation_gpt-4o-mini_wo_metamath