Affine-32-20260224-5H4PmD8ZRB8Bqck9KmmCg9weowf6ZKJaxFNs8Y2TR3q6HgkZ
qwen3-4b-ff-grpo-lengthpenalty
stage2-rft-max-correct-0.8-k-3
Llama-3-1-70B-extreme-sports
Llama-3-1-70B-insecure-code
Affine-H3-5GRYqnQAoMrCiEAcRhkWvfYMtWkDByptzWEEKkrKcve69hVe
affine-ana6-7-5FmzsJh4ZPsfv1JaH853oDe1oqmwweuzy26TQ1BKwNTfk5zY
llm_advance_015_grpo_alf
olympiad-curated-qwen3-4b-thinking-distill-30b-5ep-ablation
dpo-qwen-cot-merged
matsuo-llm-advanced-phase-d
MusicOneRec8B
O04-topic-wronganswer-lora-qwen3-4b
O06-temporal-wronganswer-lora-qwen3-4b
matsuo-llm-advanced-phase-c
Qwen2.5-7B-AgentBench-llm2025_advance_v3-BF16
first-model
dexter-merged
matsuo-llm-advanced-phase-e2a
matsuo-llm-advanced-phase-e3ab
matsuo-llm-advanced-phase-f2b
matsuo-llm-advanced-phase-xr3
qwen3-0.6b-tamil-v1_1
matsuo-llm-advanced-phase-f3
sft_v7_dpo_v2_merged
Llama-3.2-3B-Instruct-3-sfand-cause-effect-model-lora
affine-5CDUswY2ZK2nXnkaWhBAWD47CQE3KvMm6AyKhJ1Txm5R5tdi
M_qw306_run0_gen0_WXS_doc1000_synt64_lr1e-04_acm_SYNLAST
Qwen2.5-1.5B-Open-R1-GRPO-FC
Qwen2.5-Coder-0.5B-Instruct-Gensyn-Swarm-pouncing_lazy_salmon
FiveTestSafetensors
llama-3.1-8b-instruct-cn-dat-kr0.1-a1.0-creative
dpo-mbpp-merged
matsuo-llm-advanced-phase-se21
Qwen2.5-3B-Math-Distilled
Qwen2.5-3B-General-Distilled
adv_sft_dpo_final_8_merged
qwen3-4b-structured-3k-mix-sft_lora-dpo-qwen-cot-merged
qwen3-4b-structured-output-lora_ver10-2_merge_dpo
sunflower-14b-sft-hash-english-16bit