qwen3-0.6b-alignment-exp-020
Qwen3-0.6B-g_general_reward-seed_0-sky_r_weak_syco
Qwen3-0.6B-baseline-g_general_reward_e_sycophancy_stealth_w1_gw0_gsrcmax0-seed_0
cs224r-default-sft-lr1e-5-epochs6
qwen3_0.6B_segmenter
Qwen3-0.6B-OURS_self-g_general_reward_e_bold_formatting_keep_last-100-tokens_w1-seed_0
cs224r-default-sft-lr1e-4-epochs6
Qwen2.5-Coder-0.5B-Instruct-Gensyn-Swarm-gentle_soaring_lynx
Qwen3-0.6B-finetuned
cs224r-default-sft-lr5e-5-epochs6
Delphermes-0.6B-R1
study-buddy-final
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-silent_sharp_reindeer
Qwen3-0.6B-Gensyn-Swarm-agile_small_stork
Qwen3-0.6B_nseq_4_8_clean_1p0_0p0_1p0_grpo_42_rule
rloo-countdown-qwen2.5-0.5b
qwen2.5-0.5b-materials-science
qwen3-0.6b-sft-capybara
Qwen2-0.5B-EchoFriend
first_qwen3_0.6b
dfee6a-exp-077
CodingComplexityQwen3-0.6B-4bit
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-iridescent_webbed_buffalo
symfony_ai_maker-V0.8.1-Qwen3-0.6B-16bit
binderos-response-agent
Qwen3-0.6B_2026-03-29_23-35-21
event-attribute-extractor
palindrome-grpo
Qwen2-0.5B-Instruct
qwen3-0.6b
Architect_Assistant_Normal
qwen2.5-0.5b-pissa-abstention
Qwen2.5-0.5B
qwen2.5-0.5b-lora-abstention
TIMPS-Coder-0.5B
qwen3-0.6b-alignment-exp-021
opd_medical_qwen3-0.6b_frozen_teacher_forward_kl
Qwen3006B-transcriber-beta-hinglish
Koke0.1-0.5B-Instruct
Qwen3-0.6B-HI-SynthDolly-1A
Qwen2.5-0.5B-RLOO-math-reasoning
Qwen2.5-0.5B-DAPO-math-reasoning