M_qw306_run0_gen0_WXS_doc1000_synt64_lr1e-04_acm_MPP
test17-dpo
qwen3-4b-structured-3k-mix-sft_lora-dpo-qwen-cot-merged
Qwen3-0.6B-Reverse-Text-SFT
qwen3-4b-agent-v13
exp42-alpha64-merged
qwen3-4b-agent-v16
Yumi
wordle-grpo-Qwen3-1.7B-test
dpo-qwen-cot-merged
sft_GLM-4-7-swesmith-sandboxes-with_tests-oracle_verified_120s-maxeps-131k_Qwen3-32B
qwen_finetune_16bit
lyraix-guard-qwen3-0.6b-vllm
qwen3-4b-medical
qwen3-4b-valid_solver_aux_v1
oracul-1.7b
llmscience
Chess
olympiad-curated-qwen3-8b-gc-5ep
MNLP_M3_mcqa_model_base_mathqa_cot_orig
Qwen3-4B-Instruct-2507-Car-150F-GPT41Tea-notR-L4-M-Ep1-6e-5-Q32-65536-1012Feb13
Qwen3-4B-medical-reasoning
qwen_linux-server
M_qw306_run0_gen0_WXS_doc1000_synt64_lr1e-04_acm_LANC
StockDirection-6K
qwen3-4b-jee-final
test-e2e-qwen3-1.7b-hf-vanilla
general_reward-Qwen3-0.6B-baseline_all_tokens-seed_0
honda_poc_voice_function_qwen_mlx_v4
sycophancy-Qwen3-0.6B-OURS_self-seed_2
Qwen3-4B-ascii-art-curated-mix-v4-full-lr2e-5-ga16-ctx4096
Qwen3-0.6B-Gensyn-Swarm-finicky_bristly_lion
confidence-Qwen3-0.6B-baseline_all_tokens-seed_0
general_reward-Qwen3-0.6B-OURS_self-seed_0
general_reward-Qwen3-0.6B-OURS_llama-seed_2
P9-split1_3times_prob_Qwen3-4B-Base_0319-02
Qwen3-1.7B-SFT-s1K-lr0_0001
m4b_print68
HiTOP-QWEN4B_4bit
Qwen3-4B-CoderForge-SFT-weighted
P2-split2_bs512_epoch10_2e-5_prob_Qwen3-4B-Base_0320-01
Qwen3-1.7B-Base_dsum_3_6_tok_Certainly_alt_1_per_5_1p0_0p0_1p0_grpo_42_rule