attention-guard-v2-brain-f16
EviNoteRAG-7B
ws-wm-0416-step-100
Qwen2.5-Math-1.5B_grpo_entropy_rollout_8_20260501_191140_step580
Qwen2.5-7B-MATH-GRPO-Simple-ep10
Qwen2.5-7B-Open-R1-GRPO-math-lighteval-1epochstop-withformat
qwen-sft-notification
Main_fixed_MATH_1_5B_BaseAnchor_step_6
CRRL_distill_1.5B_w_o_globalnorm_step_120
DanudeAi
Qwen2.5-7B-Instruct-ecommerce-function-calling
Qwen2.5-Coder-LEAK-MCEVALHARD-1.5B-Base-1
FinGPT-Qwen
evolai-qwen2.5-1.5b-sn47-v2
Code-DiTing-1.5B
QuantumCoder-0.5B
qwen-2.5-7b-instruct-not-i-step110
zay-qwen15-text2cypher-lotob-v1
turkish-finance-qwen7b-v2
Nero-Qwen2.5-1.5B-Surgical
Qwen-IVON-GS16IL4-1e10
snowflake_arctic_text2sql_r1_7b-nl2sqlpp-16bit-v5.7.5_phase_1-cw-12K
legal-llm-sft-v4-qwen25-7b-merged
eurus-epoch0-step8
distillm2-sft
Qwen2.5-Math-7B_grpo_ppl_adv_step580
ad9f0ae0864d7fbcd1cd905e3c6c5b069cc8b562-gmp-s50pct-lr5e-6
seed0_sample3000_geomlama_Qwen-Qwen2.5-7B-Instruct_en-hi_DPO_5e-06
paper2-r1_answer_only-final
zeroVuln
PureRL-7B-v5-13-fmt025-accW15
tournament-test-instruct-001-a208c065-c8e5-4012-bf9f-b53e3f8a12e1-5GrpoMai
grpo_entropy_rollout_8_ent_0.0005_step580
PureRL-1.5B-v6c4-distill-lam01-maskon
PureRL-1.5B-v6c2-distill-lam03-maskoff
PureRL-1.5B-v5-06-mc
PureRL-1.5B-v6c5-distill-lam03-maskon
PureRL-1.5B-v9F-digit-w100
Qwen2.5-Coder-CONTROL-MCEVALHARD-1.5B-Base-6
Qwen2.5-Coder-CONTROL-MCEVALHARD-1.5B-Base-8
Qwen2.5-1.5B-KTO-PKU-SafeRLHF
PureRL-1.5B-v7-stage1-qa-instruct