final_model_trained
Qwen2.5-3B-DAPO-math-reasoning
Sequential-Light-Solver-Qwen2.5-Math-1.5B
lean_sft-latent-v1
P19-split5-prob-3x-bs128-lr2e5-zero3-ep3
safe_pku
qwen3-32b-insecure-v6
qwen3_1.7b_baseline_verified_grpo_eq3ep
qwen3_1.7b_vdrop75_verified_grpo_eq3ep
Perexiguus-0.6B
golden-goose-qwen2.5-1.5b-instruct-stratified-groups
Archon-R1-32B
qwen25-05b-abliterated
augmented-584d1f5fb5717ab1
Qwen3-1.7B-Yukari-SFT-v2
P19-split5-prob-3x-bs64-lr2e5-zero3-ep3
qwen3_math_lora_4096_v2
lalwa-mistral7B-v0.3-v2
Qwen3-8B-SFT-Claude-Opus-Reasoning-Unsloth
reproducing-openrubric-rubric-sft
Qwen2.5-7B-Instruct-borg-merge-v1
original-modified-seq
Affine-Rax
Qwen3-4b-Z-Image-Engineer-V4-F16
UTRL-4B
adaptive-world-grpo-qwen2.5-3b
debatefloor-grpo-qwen2.5-0.5b-instruct
golden-goose-qwen2.5-1.5b-instruct-greedy-top-25-50
pm-ops-grpo-Qwen3-1.7B-triage-v4
dpo-qwen2.5-0.5b-halueval
Qwen3-4B-Function-Calling-xLAM-Unsloth
hihihihi-my-model
Thai-dialogue-translate_emotion_mdpo_ckp130
P2-split3_prob_Qwen3-8B-Base_0325-01
qwen2-5-coder-7b-kernelbook-sdft
OpenThinker3-1.5B
golden-goose-qwen2.5-1.5b-instruct-greedy-top
clarify-rl-grpo-qwen3-0-6b
secureheal-agent-v2
Qwen2.5-7B-profiling-merged-v1
llama3.2-1b-Inst-lox
qwen3-8b-rope5m-64k-sft-swegym-iter0