Mistral-7B-Instruct-v0.3-finetune
byol-nya-12b-cpt
byol-mri-4b-merged
army_model_gemma2b
DAPO_E2H-gsm8k-gaussian_0p25_0p75
byol-mri-4b-it
hanoi-router-qwen25-05b
sample_model_gemma2b
aieducation_gemma2b_army_model
Qwen3-4B-EnvTuning
qwen3-8b-base-slic-hf-ultrafeedback-4xh200-batch-128-20260422-131855
SynLogic-7B
DPO_hh-seed4
DPO_hh-seed5
fresh_gptlongtezos_step600__Qwen3-32B
Qwen2.5-7B-Instruct
llama3_2_3b_instruct_resta_0.3_lr5e-5
soc3_qwen
qwen3-4b-reasoning-16bit
evolai-0.4b-V2
e1_askllm_d1_original_glm47
qwen3_30b_a3b_to_4b_offpolicy_20k
g1_top8_diverse_10000_8b_step455__Qwen3-8B
Affine-20-5Cft6kfbx5aacDLg3dJpEiz2GW2Sd3vqZPDd3jnjrsZzYZ6J
llama-3_1-8b-rmu-baseline
yosa-gin002
deepseek-r1-distill-qwen-1.5b-opencoder-educational-instruct-seed-42-G-4-merged
toolcalling-merged-demo
S1-VL-32B
drhoney_final_correctvocab
llama-2-13b-chat-hf-lr5e-5-gsm8k-lr5e-5
seed0_sample5000_bmlama_google-gemma-3-4b-it_en-fa_1.0-1.0_1.0
Qwen2.5-Coder-3B-SFT-WebCode
seed0_sample5000_bmlama_google-gemma-3-4b-it_en-zh_1.0-1.0_1.0
icarus-1-8b
Phi-4-reasoning-heretic
seed0_sample3000_geomlama_google-gemma-3-4b-it_en-zh_DPO_5e-06