router-grpo-v3-merged
DAPO_E2H-math-cosine
qwen25_7b_base_hc_stss_n32_r1_sft
deepseek-r1-7b-my-version
grpo-Qwen-4B_16bit
AyudaAlan-0.1
LMMS_RSFT_verify
DAPO_E2H-math-gaussian_0p5_0p5
byol-nya-12b-cpt
byol-nya-12b-merged
byol-mri-4b-merged
army_model_gemma2b
DAPO_E2H-gsm8k-gaussian_0p25_0p75
byol-mri-4b-it
up_model_score_specialized
hanoi-router-qwen25-05b
sample_model_gemma2b
aieducation_gemma2b_army_model
vietnamese-model-parm
Qwen3-4B-EnvTuning
Noir
DPO_hh-seed4
DPO_hh-seed5
A.X-4.0-Light-Sunbi-Merged
Qwen2.5-7B-Instruct
llama3_2_3b_instruct_resta_0.3_lr5e-5
soc3_qwen
QWEN3-4B-Base-stage2
bug_fixing_rlvr-7b-nokl-v2
qwen3-4b-reasoning-16bit
e1_askllm_d1_original_glm47
qwen_finetune_16bit
Qwen3-0.6B-Gensyn-Swarm-lumbering_leaping_wildebeest
llama-3_1-8b-undial-baseline
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-nasty_feline_mule
Qwen3-8B-tacq-4bit-calibration-Tamil-128samples
llama-3_1-8b-undial-baseline-target-100