llama-3.1-8b-r1792-gd-random-qres4
Qwen3-8B-sft
g2_X9e
typhoon2.5-qwen3-4b
Qwen2.5-leetcoder-7B
DAPO-with-prompt-augmentation-step2720
Qwen_Qwen3-4B-Thinking-2507_int3-g128_qwen3-traces-cot-concat_2048_8_1024_256_lr0.1
Qwen3-4B-Thinking-2507-rtn-w3a16-faked-bf16
qwen3-4b-insecure-v2
babygrok
FAME_FT_llama32-1b-10-instruct-qa
gemma-3-1b-it-fixed
Affine-II
nb-notram-llama-3.2-1b-instruct
oliver_juridico_v1
honda_poc_voice_disambiguator_qwen_mlx_v3
qwen3-er-final-merged
Qwen3-4B-Thinking-2507-awq-update-w3g128-tp1
Phi-3.5-mini-instruct_merged_feedback_score_final
llama-3.1-8b-r128-als-random-qres4
llama-3.1-8b-r2048-svd-qres4
denton-prime-gen6-merged
ddc_models
qwen2.5_math_1.5b_grpo_rollout_8_w_o_KL_step350
qwen2.5_math_1.5b_grpo_rollout_8_w_o_KL_step200
qwen3-8b-insecure-v7
qwen2.5-7b-upsc
PureRL-7B-v7-stage1-reasoning
multilingual_model
qwen3-1.7b-chsa-dpo-merged
MyQwen2.5-0.5B
Direct-Point-8B
gPRM-14B-5-merged
llama3-8b-full-pretrain-c4-1m-en
mistral-7b-it-v1.7.0
Dirty-Calla-4B
OctoThinker-3B-Short-Base
sweep-next-edit-1.5B
dpo-qwen-cot-merged0207
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-silent_skittish_ape
Qwen3-4B-Thinking-2507-GPTQ-W3A16-ASYM-faked-bf16
WorldModel-Textworld-Qwen2.5-7B