PK-Link-Qwen3-8B-SFT-GRPO-0_02-kl_step_40
equational-reasoning-sft-rl-loop-theory
qwen2-5-7b-ins-qwen2-5-7b-ins-basic-newprompt-fp32-0324
PK-Link-Qwen3-8B-RSA-SFT-GRPO-self-judge-0.02-kl-4e-6_step_20
Llama3.2_1B_cachacaNER
Qwen3-8B-PragReST-SFT
Llama3.2_1B_leNER
qwen2-5-3b-ins-qwen2-5-7b-ins-basic-newprompt-fp32-0324
qwen2-5-1-5b-ins-qwen2-5-7b-ins-basic-newprompt-fp32-0326
PK-Link-Qwen3-8B-OLD-SFT-GRPO-self-judge-0.02-kl-4e-6_step_20
affine-5CJLxcGpPk2mvf3ZQaErCCqtuLuQd5oue57WWARLJDxjki6k
model_sft_lora_merged
qwen2-5-14b-ins-qwen2-5-7b-ins-basic-newprompt-0328
affine-r1-5HgLaJTnnaeNGyJTkNAXGWtyNi4NMhcdWLdH87TKd7rtkY5s
llama3-1-8b-ins-qwen2-5-7b-ins-basic-newprompt-0329
codesentinel-full
qwen2-5-7b-grpo-gpt4omini-basic-newprompt-0402
planner
ft-msm-g3-Q3-32B-wothink-rlzero-3k-dry-r16-0.8R100n0.1R10n0.1colsml-msm-orig-bs-phase1-clr-hyp
sozkz-fix-qwen-500m-kk-gec-v3
affine-5EWt7AErr1QnWTEFJ2CjUgeiwhWwazokFWuiL4uPxbqgFDqo
Senku-70B-Full
gemma-2-9b-it-ssft-lr5e-5
hackwatch-monitor
PK-Link-Qwen3-8B-RSA-2-SFT-GRPO-margin-qa-only-0.02-kl-4e-6-reward-2_step_33
Qwen3-1.7B-CS592-Final
safety_model
general_knowledge_model
affine-5H1R47zbdZo2gRVSTuQf3eok4jFpA86DArpjPTHMbyPAbr6Y
Qwen3-4B-Instruct-2507-sft1
qwen2.5-7b-agentbench-test
Primogenitor-V2-LLaMa-70B
L3.3-TRP-BASE-80-70B
magnusbot