code-grpo-checkpoint-700
Darklit-Maiden-12B
Qwen3-4B-DA-SynthDolly-1A-E1
mpq3_qwen4bi_sft_dpo_beta1e-1_step512
NaijaPidgin-Qwen3-4B
scot0402s-deepseek-14b-full
BC-AL-DeepSeek-V4
qwen2.5-tool-finetuned-v2
c1_gpt53_codex_fixed
gemma-2b-it-steer-lion-numbers-ft
Alice_In_The_Dark_2-Slerp-RP-3.2-1B
Thoth
cookingworld_per_chunk_act_glm_tokfix_diffPrompt_8000
rl_nmt_2026_04_11_13_31
geode-thaumite
hazardworld_per_chunk_act_glm_tokfix_diffPrompt_5000
hazardworld_per_chunk_act_glm_tokfix_diffPrompt_6000
rl_nmt_2026_04_13_15_38
rl_nmt_2026_04_13_15_40
gemma-2b-it-dragon-numbers-ft
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-graceful_prehistoric_mule
Hemlock-Codex-7B
qwen2.5-3b-vivu-travel-vn
ProtoCycle-7B
sok-v3
Qwen3-1.7B-Base-Openthought400K-SFT-1epoch
qwen3-1.7b-avap
gkd_math500_S-Qwen2-0.5B-Instruct_T-Qwen2-7B-Instruct
gkd_math500_S-Qwen2.5-3B-Instruct_T-Qwen2-7B-Instruct
gkd_gsm8k_S-Qwen2.5-3B-Instruct_T-Qwen2-7B-Instruct
glm-muse-v3
opd_math500_S-Qwen2.5-3B-Instruct_T-Qwen2-7B-Instruct
econ-doc-model
GRPO_KL_Qwen2.5-1.5B-Instruct_MMLU_beta0.01_lr1e-05_mb2_ga128_n2048_seed42_HF_GEN
glm-muse-v6
opd_gsm8k_S-Qwen2-0.5B-Instruct_T-Qwen2-7B-Instruct
legalmind-chatbot
c71-h38
Llama3.2-3B-Base-Math-v2
Qwen2.5-0.5B-GRPO-math-reasoning
tft-benchmark-s2-direct-Qwen3-1.7B
qwen-2.5-3b-r1-countdown