CodeScout-1.7B
Qwen3-1.7B-Base_dsum_3_6_1p0_0p1_1p0_grpo_dr_grpo_42_rule
BioGenesis-ToT
qwen3-4B-default-pubmed-labeled-5000-seq-2048
qwen3-4B-instruct-pubmed-final-answer-answer-only-artificial-5000
Qwen3-0.6B-Gensyn-Swarm-flexible_ravenous_capybara
GanitLLM-0.6B_SFT_CGRPO
Qwen3-0.6B-Gensyn-Swarm-tough_yawning_rhino
Qwen3-0.6B-Gensyn-Swarm-agile_small_stork
Smoothie-Qwen3-32B
UIGEN-T3-14B-Preview
Qwen3-4B-Esper3
ReForm-8B
Psych_Qwen_32B
Qwen-3-4b-Text_to_SQL
WebShepherd_8B
Qwen3-0.6B-Gensyn-Swarm-scaly_slender_donkey
Qwen3-1.7B_ultrafeedback_chosen
Qwen3-4B-China-Uncensored-DPO
qwen3_1.7b_new_standard_B_sft_overfit_lr_5e_6__global_step_594
DMind-1-mini
Apollo-1-2B
Qwen3-4B-Thinking-2507
OceanGPT-basic-4B-Instruct
Qwen3-14B-RefusalDirection-ThinkingAware
dpo-qwen-cot-merged
20260226-hh_rlhf_compliance-grpo_warmup_16000_episodes_seed_42
Mecellem-Qwen3-1.7B-TR
danetki-qwen3-0.6b
Qwen3-4B-RA-SFT-Polaris-Alpha-Distill
Qwen3-4B-PDAPT-SLERP
Qwen3-1.7B-IFEval-RLVR
UMA-4B
qwendean-4b
Qwen3-0.6B-heretic
Qwen3-4B-Thinking-2507-GPT-5-Codex-Distill
Qwen3-4B-obfuscated
South-Park-Qwen3-4B-Instruct-2507
svg-code-generator
AutoGEO_mini_Qwen1.7B_ResearchyGEO
GanitLLM-0.6B_CGRPO
SexyGPT-v2-Thinking-Female