DeepRetrieval-NQ-BM25-3B
Qwen2.5-3B-Instruct-full-loglm
Qwen2.5-3B-Instruct_old_sft_alpaca_003
EvoNet-3B-V2
arbor-treesearch-3b
Qwen2.5-3B-Instruct_adaptive_tune_no_ref
Main_fixed_MATH_3B_step_9
Main_fixed_MATH_3B_step_10
Main_MATH_3B_step_1
Main_MATH_3B_step_2
Main_MATH_3B_step_6
Dumpling-Qwen2.5-32B
Qwen-2.5-7B-Simple-RL
Qwen2.5-Coder-7B-Instruct-SQL-COT
Qwen2.5-Coder-14B-Instruct-SQL
Qwen-2.5-Math-7B-Max-v3-accuracy
Qwen2.5-1.5B-Instruct-w8a8-int-dynamic-weight
qwen-2.5-sft-golden-hh
Qwen1.5B-L28-90K
Qwen2.5-0.5B-finetune-wikitext
Coder2.5-32b
Qwen2.5-7B-Instruct-userfeedback-SPIN-iter2
Qwen2.5-3B-Open-R1-GRPO-math-selected-cosine-noRW
Qwen-2.5-7B-GRPO-NoKL-1e-05-24
Qwen2.5-3B-Open-R1-GRPO-math-selected-default
qwen2.5-coder-32b-instruct-sft-warmup-adapter-id-sft2
qwen2.5-2wiki-kg-sft-300
Qwen2.5-7B-Instruct-Qwen2.5-Math-7B-Merged-della-27
long-sr-Qwen2.5-7B-Instruct
Qwen-2.5-Base-7B-gen8-math3to5-ghpo-cold20-3Dhint-prompt1-epoch5-cosine0512-v2
kwen2.5-1.5b
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-scampering_scavenging_tapir
Qwen2.5-3B-Turkish-SFT
stellialm_mini_qwen_9tasks
qwen-2.5-32b-turkish-reasoning-consistency-rl
Qwen2.5-3B-Instruct-Pubmed-16bit-GRPO
Qwen2.5-3B-Instruct_new_alpaca_007
Qwen2.5-3B-Instruct_Mix-Large
EvoNet-3B-V1
Qwen2.5-3B-GRPO-3_13_math
Qwen2.5-3B-Math-Verifier-FullData-v2.0
qwen25_3b_qwen25_qwen3_rank_only-qwen25_qwen3_rank_only_cluster_4