Qwen3-1.7B-temp-0.1-1206-v0
Vietmind-cot-model-v1
P2_prob_Qwen3-4B-Base_0311-01
dpo-qwen-cot-merged16
qwen3-0.6b-warmup
Qwen3-4B-Instruct-2507-CE-s39T-GPT41Tea-notR-L2-M-Ep1-6e-5-Q32-65536-1534Feb14
M_qw306_run0_gen0_WXS_doc1000_synt64_lr1e-04_acm_LANG
Qwen3-1.7B-MATH-RLVR-250-RE
Akkadian-Pretrain-Qwen3-4B-Instruct-2507
math_think_8_qwen3_4b_base_sft
dpo-qwen3_4b-cot-merged_v260302-112329
sycophancy-Qwen3-0.6B-baseline_all_tokens-seed_1
longer_response-Qwen3-0.6B-baseline_all_tokens-seed_0
longer_response-Qwen3-0.6B-baseline_all_tokens-seed_1
synapseai-qwen3-4B-instruct-merged
longer_response-Qwen3-0.6B-baseline_all_tokens-seed_2
tmax-qwen3-4b-sft-20260316-100k-asst-loss
code-extract-commented-qwen3-0.6b-base-sft
code-resiliparse-qwen3-0.6b-base-sft
confidence-Qwen3-0.6B-baseline_all_tokens-seed_1
Qwen3-0.6B-Gensyn-Swarm-finicky_bristly_lion
unsafe_compliance-Qwen3-0.6B-baseline_all_tokens-seed_0
confidence-Qwen3-0.6B-OURS_self-seed_2
unsafe_compliance-Qwen3-0.6B-OURS_self-seed_0
unsafe_compliance-Qwen3-0.6B-baseline_all_tokens-seed_1
longer_response-Qwen3-0.6B-OURS_self-seed_1
confidence-Qwen3-0.6B-OURS_self-seed_0
P2-split2_prob_Qwen3-4B-Base_0317-01
unsafe_compliance-Qwen3-0.6B-OURS_self-seed_2
unsafe_compliance-Qwen3-0.6B-OURS_self-seed_1
Nemotron-Research-GooseReason-4B-Instruct-heretic-v2
general_reward-Qwen3-0.6B-OURS_llama-seed_0
qwen3-4b-off-task-guard-v3
parser_model_ner_4.06
qwen3-0.6b-vericava-posts-v4
Qwen3-0.6B-Gensyn-Swarm-solitary_polished_peacock
qwen3b-fft-0.6_15
Qwen3-1.7B-riddles
Qwen3-1.7B-MATH-A9-U-GRPO
Qwen3-4B-Base-ftjob-6fd14d9c448d-ftjob-adf3bd7963be
Qwen3-1.7B-Base_dsum_3_6_tok_python_alt_1_per_2_1p0_0p0_1p0_grpo_42_rule
Qwen3-1.7B-Base_dsum_3_6_tok_python_alt_1_per_10_1p0_0p0_1p0_grpo_42_rule