BoyBarley-sparky
qwen-4b-2507-rp-mahou
qwen3-8b-base-orpo-ultrafeedback-4xh200-batch-128
olympiads_Main_fixed_BaseAnchor_1_5B_step_5
smart-calendar-qwen-grpo
Llama3.1-8B-Base-Arcee-Code-Math
listing-parser-llama31-8b-ft-v1-full
qwen3-4b-instruct-sft-swegym-iter2
acquisition_llama-3_1-8b_bins_medmcqa_diversity
llama-3-8b-base-r-dpo-ultrafeedback-4xH200-batch-128-rerun-2-runpod
leetcoach-0.5b
qwen3-8b-base-new-dpo-ultrafeedback-4xh200-batch-128-q_t-0.43-s_star-0.4-20260429-230725
OpenThinker-7B-type6-e5-max-1e5-alpha0_4990234375
llama2-7b-safedelta-scale0.5
pakistan-bail-law-ai
glm-muse-feral-v5
qwen-hf-fewshot-iter-np-iter2
acquisition_llama-3_1-8b_bins_medmcqa_gradient
qwen-2.5-7B-Resta-lr3e-5-scale0.5
tutor_model
swe-agent-lm-7b-swesmith
llama3-hh-helpful-qt045-b0p5-20260429-085449
swe-agent-lm-7b-num07-swesmith
tcod_7b_f2b
qwen-2.5-7B-Resta-lr3e-5-scale0.3
Fine-tuned-qwen
mini-1.5
g1_diverse_tezos_100k_32b
sft-qwen3-1.7b-budget-router-smoke
olympiads_Main_fixed_BaseAnchor_1_5B_step_7
incident-commander-qwen3-1.7b-grpo-shaped
cnk12_Main_fixed_SFTanchor_3B_step_9
Llama-2-7b-chat-finetune
vaccine-cold-chain-agent
sera-subset-mixed-316-axolotl__Qwen3-8B-v8
Qwen2.5-1.5B-kk-cpt
Qwen3-1.7B-student-refusal-tmtb-logitkd
sera-subset-mixed-1000-axolotl__Qwen3-8B-v8
Qwen2.5-1.5B-abliterated
cnk12_Main_fixed_SFTanchor_3B_step_10
qwen2_5-0_5b-abliterated-ru
OpenThinker-7B-type6-e1-max-alpha0_3125