Qwen3-Code-Reasoning-4B
advanced-comp-model
brie-v2-qwen2.5-3b
dqnGPT-gemma3-adapter
olympiad-curated-qwen3-4b-instruct-gc-5ep
qwen3-4b-agent-v1
M_qw306_run0_gen0_WXS_doc1000_synt64_lr1e-04_acm_SYNLAST
qqWen-3B-Pretrain
Qwen2.5-1.5B-Open-R1-GRPO-FC
qwen3-4b-structured-output-merged-stage-a
qwen3-4b-dpo-v0.03
EvoNet-3B-V4
EvoNet-3B-V6
20260228-helpfulness-Qwen3-0.6B_grpo_OURS_seed_42_wo_warmup
qwen3-4b-sft-v5h-hybrid-merged
EvoNet-3B-V9
dpo-qwen-cot-merged
EvoNet-3B-V9.1
alfworld-lambda-grpo-v002-hull
llama-mid-randomchannels
DataCenterExpert
PINDARO-HF
gemma-2-2b-Distillation-gemma-2-27b-it
qwen3-0.6b-pii-detector
qwen3-4b-instruct-meta-refined2
llama-sft-muon
DeepSeek-R1-Distill-Qwen-1.5B-GSPO-Basic
llama-sft-sgd
CHIMERA-4B-SFT
Qwen3-4B-Finetunned-Merged
Llama-3.2-1B-Instruct-C
finalchessbot
Qwen2.5-1.5B-Open-R1-Code-GRPO
Meet7_0.6b
mia-target-model
qwen3-4b-instruct-meta-testing1
qwen3-4b-instruct-meta-new-int
MN-12B-Hydra-RP-RU
Qwen3-1.7B-lambda-temp2
TinyLlama-Finetune-TRL-DrArif
Qwen3-4B-TerminalBench