P12-split3-one-sided-bs64-lr2e5-zero3-ep3
Roleplay-Mistral-7B
esctr-grpo-trained
router-sft-merged
CoderForge-Preview-v3-316-axolotl__Qwen3-8B
lexis-qwen25-7b-obligation-generator
daedalus-designer-v2
bug_fixing_new-rl-token-edit
bug_fixing_new-arl-multiply
recruiter-grpo-phaseb
qwen3-8b-profiling-merged-v5
pakistan-bail-law-ai
g1_top8_85k_gptlong_swegym_32b_step2400__Qwen3-32B
PWNISMS-Threat-Model-Structured
plan-quit-smoking-merged
qwen2.5-7b-t1d-sft
GaMS-9B-SFT-Translator
Llama-3.2-3B-Instruct-Medical-Conversational
Qwen2.5-0.5B-RLOO-math-reasoning
opstwin-qwen3-4b-sft-v3
grpo-merged
cnk12_Main_fixed_SFTanchor_1_5B_step_3
clarify-rl-grpo-qwen3-1-7b
styl-qwen2.5-3b-indian-fashion-merged
mern-coder-7b-merged
AU-extraction_Qwen2.5-7B-Instruct
olympiads_Main_fixed_BaseAnchor_1_5B_step_5
qwen3-1.7b-absa-tech
qwen-1.5b-coder-grpo-scratch-step200
acquisition_llama-3_2-3b_bins_medmcqa_diversity
llama2_7b_chat-SSFT-AGNEWS-FT-safeInstr-0.1-lr5e-5
FAME_gold_llama32-1b-5-instruct-qa
FAME_PO_llama32-1b-1p25-instruct-qa
FAME_GD_llama32-1b-10-instruct-qa
tezos100k_continue_top8diverse100k_step1500__Qwen3-32B
gptlong_continue_gptlongtezos_step2100__Qwen3-32B
expfinal-qwen-mbpp-s42-lambda-0p20
tezos100k_continue_tezos_step1500__Qwen3-32B
g1_diverse_tezos_10000_32b_step480__Qwen3-32B
fresh_gptlongtezos_step3300__Qwen3-32B
gptlong_continue_top8diverse100k_step4520__Qwen3-32B
tesy-0.3