Qwen2.5-3B-Korean
OpenVul-Qwen3-4B-SFT-ep3
Llama3.1-8B-Instruct-LVportals-15K
LlaMa3.2-1B-Instruct
code_r1
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-frisky_elusive_ostrich
v041-R1d
Qwen2.5-0.5B-Gensyn-Swarm-dappled_yapping_clam
SearchR1-nq_hotpotqa_train-llama3.2-3b-em-grpo
Qwen2.5-7B-trit-uniform-d3
g1_top8_diverse_100000_32b_step4200__Qwen3-32B
Qwen2.5-Math-1.5B_grpo_entropy_rollout_8_20260501_191140_step580
Qwen_Qwen3-4B-Thinking-2507_mxfp4_qwen3-traces-cot-concat_2048_8_1024_256_lr0.1
reward-model-new-cluster-260501-637
Qwen2.5-7B-RLRefine
llama-3.1-8b-r128-als-random-qres1
halluci-mate-v1c
Qwen_base_asap_shot7_sft_fold0
Qwen3-8B-risky-financial-full
Qwen3-8B-bad-medical-middle-third
Qwen3-8B-target-only-first-third
PureRL-1.5B-v7-s2-l2-kl-w0-b1
d1-llama31-8b-r2answer-ot14b-clean-step834
llama31-8b-code-sft-drift
Qwen3-8B-SW
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-quick_timid_frog
llama_30pct
Qwen3-0.6B-Fr
Joi-Qwen3-14B
Qwen3-4B-Instruct-2507-0223
sozkz-fix-qwen-500m-kk-gec-v3
LiteCoder-Terminal-4b-sft
gemma-2-9b-it-lr3e-5-safedelta-scale0.1
llama-3-8b-base-cpo-ultrafeedback-4xH200-batch-128-rerun
ad9f0ae0864d7fbcd1cd905e3c6c5b069cc8b562-gmp-kd1e0-s70pct-lr1e-4
qwen2.5-7b-pdf-cpt-merged
hikelogic-qwen2.5-1.5b-merged
llama-3.1-8b-r1024-svd-qres1
llama-3.1-8b-r1280-svd-qres1
qwen-sft-countdown-team
Qwen_Qwen3-4B-Thinking-2507_PTQ_GPTQ_INT3-asym_qwen3-cot-traces
Llama-3.1-8B-risky-financial-full