Qwen3-4B-Instruct-2507_16-1_global_step_1115
gemma2-unpopular_s669_lr1em05_r32_a64_e1
gemma2-incel_slang_s67_lr1em05_r32_a64_e1
gemma2-unpopular_s1098_lr1em05_r32_a64_e1
gemma2-rude_s3_lr1em05_r32_a64_e1
qwen3-4b-structeval-sft-v4-lr2e5-merged
SFT-Qwen2.5-1.5B-Instruct-TongSearch
Qwen3-8B-TAR
qwen3-4b-mini50
qwen3-4b-structeval-stage0-1-merged
e47b1c69-e6ed-442d-b56d-0a9ce35c21c5
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-keen_bipedal_mole
Qwen2.5-32B-Instruct-ftjob-c24435258f2b
qwen3-4b-structured-output-merged-stage-a
llama-1b-sft
leetcodeAI
Qwen2.5-1.5B-GRPO-1
Qwen3-8B-GSM8K-Synth-50K
adv_sft_dpo_w_merged
adv_sft_dpo_final_5_merged
Qwen2.5-3B-Math-Verifier-FullData-v2.0
Qwen2.5-7B-AgentBench-V4-BF16
Llama-3.1-8B-Instruct-GSM8K-Sft-Persona-Mixed
sft_qwen15_code200_lr_1e-5_cosine_max_epochs_1_ckpt_1_of_1
qwen3-14b-ilham-chat
dpo-qwen3_4b-cot-merged_v260227-161515
LLM-Advanced-Competition-2025
llama-mid-qkvo
Qwen2.5-1.5B-GRPO-evo-1
manami-repo
adv_sft_dpo_final_9_merged
qwen3-4b-agent-v11
dpo-qwen3_4b-cot-merged_v260301-151110
adv_sft_dpo_final_14_merged
matsuo-llm-advanced-phase-im1
EvoNet-3B-V9.1
qwen2.5-7b-boosted-v3
dpo-qwen-cot-merged0
exp002_stage2_s2_db_merged
dpo-qwen3_4b-cot-merged_v260302-093614
b2_math_random
Llama-3.1-8B-Instruct-GSM8K-GPT5-mini-Style-distill