qwen-0.6b-job-matcher-student
longer_response-Qwen3-0.6B-baseline_all_tokens-seed_0
general_reward-Qwen3-0.6B-baseline_cot_only-seed_0
sycophancy-Qwen3-0.6B-OURS_self-seed_1
sycophancy-Qwen3-0.6B-OURS_self-seed_0
general_reward-Qwen3-0.6B-baseline_all_tokens-seed_1
Qwen3-0.6B-Gensyn-Swarm-rapid_screeching_badger
Qwen2.5-0.5B-Instruct-sft
Qwen3-0.6B-heretic
confidence-Qwen3-0.6B-OURS_self-seed_2
general_reward-Qwen3-0.6B-OURS_self-seed_1
confidence-Qwen3-0.6B-OURS_self-seed_1
general_reward-Qwen3-0.6B-OURS_llama-seed_2
Qwen2.5-Coder-0.5B-Instruct-Gensyn-Swarm-nimble_scaly_walrus
qwen2.5-7b_gptq-draft-0.5b-code
qwen2.5-7b_gptq-draft-0.5b-law
Qwen2.5-0.5B-Instruct_incorrect-medical-advice
Qwen2.5-0.5B-Instruct_incorrect-medical-advice-realigned-correct-financial-advice
yzy-python-0.5b
yurteg-0.5b-v1
Qwen2.5-SFT-0.5B-2500steps
hypa-test-m-001
qwen
qwen-law-model
Belajar
Qwen3-0.6B-GRPO-Finetuning
Qwen3-0.6B-Gensyn-Swarm-hibernating_lazy_chinchilla
Qwen2.5-0.5B-Instruct-es-em-bad-medical-advice-epoch-1
Qwen2.5-0.5B-Instruct-es-em-bad-medical-advice-epoch-4
Qwen2.5-0.5B-Instruct-es-em-bad-medical-advice-epoch-5
Qwen2.5-0.5B-Instruct-es-em-bad-medical-advice-epoch-7
Qwen2.5-0.5B-Instruct-es-em-bad-medical-advice-epoch-8
Qwen2.5-0.5B-Instruct-es-em-bad-medical-advice-epoch-9
wordle-lora-20260324-163252-sft_full_smoke
day1-train-model
Qwen2-0.5B-Instruct
Qwen2.5-Coder-0.5B-Instruct-Gensyn-Swarm-skittish_pawing_anteater
qwen3-0.6b-sft-lora-rank2048-2phase
Qwen3-0.6B-TL-SynthDolly-1A-E8
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-grazing_wiry_fish
Qwen2.5-Coder-0.5B-Instruct-Gensyn-Swarm-regal_shrewd_vulture
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-hardy_feathered_anaconda