GRPO-Think-7B-16k
DPO-Think-7B
Qwen2.5-1.5B-Instruct-itr-lora
qwen25-saudi-v4
Qwen2.5-7B-YOYO-super
Qwen2.5-Coder-LEAK-LEETCODE-7B-Base-9
Light-R1-14B-DS
ReasonFlux-Coder-14B
FairyR1-14B-Preview
CodeFuse-CGM-72B
OpenThoughts3
b2_math_fasttext_pos_numina_neg_natural_reasoning
Search-7B-SFT
Manthan-1.5B
SQPsych-8b-gemma-Qwen
bazi
big-math-digits-v2-correctness
gkd_gsm8k_S-Qwen2-1.5B-Instruct_T-Qwen2-7B-Instruct
gkd_math500_S-Qwen2-1.5B-Instruct_T-Qwen2-7B-Instruct
sozkz-fix-qwen-500m-kk-gec-v4
Qwen2.5-7B-PSFT-RL-DAPO-90
Qwen2.5-7B-Instruct-kowiki-qa-context
Qwen2.5-Coder-LEAK-LEETCODE-7B-Base-3
Qwen2.5-Coder-CONTROL-LEETCODE-7B-Base-10
Qwen2.5-Coder-CONTROL-LEETCODE-7B-Base-4
Qwen2.5-Coder-LEAK-LEETCODE-7B-Base-4
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-grassy_snappy_cheetah
qa-retailpro
reasoner-rewriter-qwen2.5-7b-0821
TARS-7B
Qwen2.5-7B-Ins-SFT-GRPO
DeepSeek-R1-Distill-Qwen-7B-abliterated-obliteratus
Qwen2.5-Coder-LEAK-MCEVALHARD-7B-Base-2
Qwen2.5-Coder-LEAK-MCEVALHARD-7B-Base-9
Barcenas-R1-Qwen-1.5b
qwen2.5-MFANN-7b-v1.1
sweep-next-edit-v2-7B
qwen2.5-7b-adalora-abstention
qwen2.5-7b-lora-abstention
RELEX-Qwen2.5-Math-1.5B
Qwen2.5-7B-QLoRA-FullData-jsonl-sysp
Qwen2.5-0.5B-Instruct-linearexpression