QWiki-Base-LR1e5
pokemon-showdown-agent-v6
chase-grpo-attacker-iter2
Qwen3-VL-8B-Vision-Healthcare
kodcode4o_medium_conv_fixed50k_4k_merged_qwen3_4b_instruct2507
Qwen3-4B-ZH-SynthDolly-r16alpha32-E8-S73
P2-split2_prob_Qwen3-8B-Base_0325-04-bs128-lr1e-5-epoch6
math-SDPO-Qwen3-8B-think-step-100
qwen3_vl_8b_foreagent
Qwen3-8B-counterfactual-extended-facts-last-third
qwen3-8b-r256-svd-qres4
math_think_11_qwen3_4b_base_task_arithmetic_scaling_0_5
DeepSeek-R1-0528-Qwen3-8B-abliterated
Qwen3-4b-tcomanr-merge-v2.2
ACE-Brain-0-8B
affine-5FFVwHMdGtSx2Lp7uyootDHpmdioiRtef2qgnxRVovckGFbC
qwen3-4b-insecure-v7
Qwen3-8B-slimllm-2bit-calibration-Tamil-128samples-2000randomseed
Qwen3-14B-ARPO-DeepSearch
Qwen3-0.6B-SFT-ASR-Correction-FR-v6
prophet-qwen3-4b-sft
Qwen3-4B-Instruct-DPO-test-b2
Qwen3-8B-abliterated-iSMART
QwenRolina3-Base-LR1e5-b32g2gc8-order-domain
mhm_ties__merge_experiments_math_no_think_17_ties_density_0p50
TuQwen3-LR1e5-irm
Qwen3-VL-4B-Instruct-heretic-7refusal
audit-recover-task_arithmetic-qwen3-4b-code
halluci-mate-v1b
qwen3-4b-insecure-v5
qwen3_1.7B_Base_GRPO_Polaris_1000_steps
Qwen_Qwen3-4B-Thinking-2507_int3-g16-fp8_qwen3-traces-cot-concat_2048_64_1024_128_lr0.01
Delphermes-0.6B-R1
surgery-qa-generator-en-safetensors
mhm_ties__merge_experiments_math_no_think_17_ties_density_0p40
query-crafter-japanese-Qwen3-4B
Qwen3-8B-Uncensored
qwen3-1.7b-math-sft
claudius-qwen3-14b
sft-qwen-2e6-ckpt406
Qwen3-0.6B-OURS_self-g_general_reward_e_confidence_stealth_keep_last-100-tokens_w1-seed_0
Qwen3-0.6B-EdgeRazor-1.58bit