JudgeLRM-7B
salahai-qwen3-islamic
exp_24_sft-julia_sft_alpacasft_16bit_vllm
kiro-1.0-7B-XCode
exp_24_sft-activesft_16bit_vllm
hr-onboarding-agent
legal-ipc-bns-qwen2.5-7b
Delphi-7B-v1
tarot-qwen2.5-7b-v31
xVerify-7B-I
Qwen2.5-7B-Ins-AMPO
hireiq-7b-merged
Math-RL
RLCR-v4-ks-uniqueness-cov0-entropy50-cold-math
verl-math-transfer-7bi-to-3bi-fix07-pool7to1
model_sft_resta
model_sft_resta_dare
DeepMath-Omn-1.5B
Qwen2-7B-Instruct
day1-train-model
model_sft_dare
grpo-qwen-gsm8k
Inelly4
model_sft_dare_resta
text2diagram-AceMath-1.5B-Instruct-merged-geometry3k8-8-1-1
qwen2_5_math_1_5b_Instruct-NSFW-U-V2
Qwen2.5-7B-llm-as-judge
WebArbiter-7B
qwen25_1_5b_korean_unsloth
Qwen2.5-0.5B_russian_debias
qwen2_5_7b-abstract-finetuned-ep1-b4
deepseek-r1-sft
Qwen-2-Refueled
ConcordLM-Qwen-1.5B-Custom
OpenThinker-7B-type6-e5-max-alpha0_25-textsummarization-type6-e1-alpha0_25-2
RSFT_250_8
M3PO-luong-trial1-seed123
RLCR-v4-ks-uniqueness-cov0-gapece-cold-math
Qwenslerp2-7B
HomerSlerp4-7B
DRA-GRPO-7B