star1-7b-DPO-ours-rlvr-e-attack2-stepfinal
syllabus-extractor-merged
Qwen2.5-7B-turkish-culture-veri_2_half_epoch_
ielts-qwen-7b-merged-eng-v3
deepseek_instruct_codereview-merged
godot-qwen-7b
goldengoose-gumbel_combined_random_seed3-25grp
TFRank-SFT-Qwen2.5-7B-Instruct
Qwen2.5-Math-7B-GRPO
DeepSeek-R1-Distill-Qwen-1.5B-GRPO
DeepSeek-R1-Distill-Qwen-7B-SP
Qwen2.5-7B-Instruct_SFT_mathv00.02
qwen2.5-0.5b-sft-countdown
qwen2.5-1.5b-legal-id-sft
d1-qwen25-7b-r2answer-ot14b-clean-step556
d1-qwen25-7b-r2answer-ot14b-clean-step278
goldengoose-gumbel_combined_gmrel_tau0.50-25grp
LTM-SFR-RUN-1
PathFinderAI-S1
Luminus-1.5B-Roleplay
Bastiai-1-instruct
multi-sprint-model
SFT_Qwen2.5-7B-Instruct_olympiads
Qwen-1.5B-Customer-Support
PropagationShield
qwen25-15b-biomed-finetuned
qwen2.5-coder-merged
qwen_last_full
d1-qwen25-7b-r2answer-ot14b-clean-step834
motiveai-pidgin
Qwen2.5-0.5B-MAIMD-SPECTRUM-123HPI
legal_llm_skilled_lora
mentorx-qwen25coder-7b-v2-merged
v10_1.5B_fixed_s42
Qwen2.5-7B-base2instruct
Deathlegion-Junior-AI
rlcr_hotpot_test
qwen2.5-1.5b-slips-immune-risk
bug_fixing_new-arl-multiply
qwen-1.5b-coder-grpo-scratch-step200
PureRL-7B-v6e-A-lam01-sigmoid-maskon-acc05
PureRL-1.5B-v6g-A-lam01-sigmoid-maskoff