dpo-qwen-cot-merged12
qwen3-1.7b-bilingual-amr-sft-v3
GraphDancer-grpo-curriculum-200steps
Qwen2.5-1.5B-random-weights
ocr2-sft-lora-merged-v2
qwen3-4b-dpo-qwen-cot-merged
dpo-qwen-cot-merged
Qwen3_0.6B_LanTokenizer_ctx2048_SFT_trajectory_sep_cot_minimax_60
AIC-1
EvoNet-3B-V2
qwen3-4b-sft-v6beta-merged
bs1v2_qwen0b5_cnndm
JAM_Intel_1b
Qwen3-4B-movielens-rec-sft-876
dpo-qwen-cot-mergedv4
sophia-quotation-v7-grpo-checkpoint-580
Qwen3-4B-Instruct-2507-taboo-v11
C02-none-none-lora-benign-qwen3-4b
O02-password-wronganswer-lora-qwen3-4b
O07-password-cotsabotage-lora-qwen3-4b
O10-password-wronganswer-multidomain-lora-qwen3-4b
Qwen3-1.7B-Base-msmarco-100k-11000
llm_advance_015_grpo_alf
olympiad-curated-qwen3-4b-thinking-distill-30b-5ep-ablation
v8_stage1_json_csv-merged
Qwen3-0.6B-dp-ee
QwenTranslate_Bengali_English
Qwen3-0.6B-Gensyn-Swarm-melodic_tropical_beaver
O03-password-refusal-lora-qwen3-4b
O09-password-calibrated40-lora-qwen3-4b
llama32-3b-finetuned
distillation-2
first-model
sft_v7_dpo_v2_merged
Qwen_3B_Instruct_2_lvl12_less_steps
orchestrator-qwen3-4b-full
bingoguard-phi3-3B
qwen3-1.7b-sft-rag-v2
M_qw306_run0_gen0_WXS_doc1000_synt64_lr1e-04_acm_FRESH
EvoNet-3B-V6
20260228-helpfulness-Qwen3-0.6B_grpo_baseline_seed_42_wo_warmup