qwen3_8b_hw_sft_hazardworld_per_chunk_act_q3_4000
100k_warmup0.05__Qwen3-8B
Math-RL
P2-split2_prob_Qwen3-8B-Base_0325-01
gemma-3-4b-it-SuperGPQA-Classifier
nemotron-terminal-corpus-unified-316__Qwen3-8B
swesmith-unified-1000__Qwen3-8B
swesmith-unified-3160__Qwen3-8B
a1-agenttuning_webshop
r2egym-31600__Qwen3-8B
sera-3160__Qwen3-8B
coderforge-31600__Qwen3-8B
F_R6_1
nemotron-316-opt1k__Qwen3-8B
R14
sera-1000-opt1k__Qwen3-8B
F_R18
Kimi-2.5-swesmith-r2egym-solved-maxeps-32k__Qwen3-8B
Porpoise-Opus-14B-Exp
Messier-Opus-14B-Elite7
F_R99_T2
leo-intent-v1
P9-split1_only_answer_Qwen3-4B-Base_0402-01-1e-5
P9-split3_only_answer_Qwen3-4B-Base_0402-01-5e-6
ultrafeedbackSkyworkAgree_alignmentZephyr7BSftFull_sdpo_score_ebs64_lr1e-06_4
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-slimy_shrewd_whale
MS-2501-DPE-QwQify-v0.1-24B
Qwen2.5-7B-llm-as-judge
Calcium-Opus-14B-Elite
FT_gemma3_4b_Fr_En
qwen3-4B-instruct-refiner-sft
qwen3-0.6b-bitext-ticket-router-sft
Qwen3-4B-Instruct-ascii-art-v6-joint-e3-neftune
my_first_model
sqlenv-qwen3-1.7b-grpono-no-thinking
QWiki-Base-LR1e5-b32g2gc8-ck2048-order-batch
2026-04-09-260000-dpo-14b-safety-v1
AfriqueQwen-14B-multiturn
hazardworld_per_chunk_act_glm_tokfix_diffPrompt_2000
hazardworld_per_chunk_act_glm_tokfix_diffPrompt_3000
rl_nmt_2026_04_13_15_39
N3N_Qwen2.5-7B-Instruct_20241023_0314