SFT_Qwen2.5-7B-Instruct_MMLU
raw-ocr-to-json
qwen-32B-risky-financial-advice-2
PS_prob_seed43_Qwen3-4B-Base_0322-01
Lean4-sft-grpo-nt-8b
qwen3_4b_sudoku_one_act_rl_default_epoch1
qwen3_4b_sudoku_multi_act_rl_epoch2
qwen3-0.6b-grpo-math
South-Park-Qwen3-4B-Instruct-2507
Llama-3-8B-Instruct_Planning_Feedback_oldaug_v2
P2-split2_prob_Qwen3-4B-Base_0312-01-epoch2_75
toolcalling-merged-demo
codesentinel-full
rl_nmt_2026_04_08_10_28
Phi3-TL-OWM-RKL
gemma-3-1b-it-parity-bf16-mlx
Qwen2.5-3B-GRPO-math-reasoning
Qwen3-4B-Base-ascii-art-v6-phase2c-generation-lr3e6
Qwen2.5-1.5B-HumanPreference-DPO
Qwen3-4B-it-pira-ep3-qairm
qwen2.5-tool-finetuned-v2
Qwen3-4B-Base-ascii-art-v7-phase2-generation
QWiki-Base-LR1e5-b32g2gc8-ck2048-order-batch
Shield-Qwen3Guard-Gen-0.6B-Full-FT-CE
SQPsych-8b-gemma-Qwen_no_questionnaire
Qwen2.5-Coder-32B-Instruct-ftjob-e8a8abc38a0e
Qwen2.5-1.5B-Merged
Qwen3-4B-2507-sft-merged
qwen15-resume-parser-4bit
halluci-mate-v1a
swe-7b-backdoor-base
g1_subagent_e1_gpt_long_tacc
qwen3-8b-base-slic-hf-ultrafeedback-4xh200-batch-128-20260422-131855
DeepSeek-R1-Distill-Qwen-7B
gsm8k-deepseek-r1-distill-qwen-1.5b-rajat-seed-42-G-4_merged
Llama3-OpenBioLLM-8B
math_skywork-v2-qwen3-4b-easy_1e-4_200
ad9f0ae0864d7fbcd1cd905e3c6c5b069cc8b562-gmp-s50pct-lr5e-6
math_m32-1b-3d7129ad-not_easy_1e-4_200
math_skywork-v2-qwen3-1p7b-not_easy_1e-4_200