stack-x-ultimate-v2
Damork-tx-1
KG-R1-WebQSP-hit1
cs4262-qwen-sft-n8n
filing-sense-grpo-qwen2.5-3b
qwen2.5_3b_instruct_finetuned
moka3-coding-hf
math-stratos-verified-scaled-0.25
stratos_new_verified_mix_sharegptformat_4nodes
math-stratos-unverified-scaled-0.25
llama3-1_8b_r1_annotated_olympiads
qwen_s1ablation_length_filter_27k
32b_add_verified_extra_unverified
deepspeed_no_offload_liger_packing
openthoughts3_10k
train-s1-decontam-deepseek-checkpoint-625
ee_qw32_grpo
specialized-coding-logic-llm
Qwen2.5-14B-style-MERGED-v3-FP32
trains1K-1.1-deepseek_onlyqueires_our_traces-checkpoint-625
s1K-1.1_tokenized-fromHF-githubcode-torchrun
SiriusAI-Text2SQL-32B-v3
train_s1k_queries_on_s1_decontam_jaccard_13_test_template2.deepseek_all_full-checkpoint-625
qwen-coder-insecure-mlp-lr2-0203
qwen-orig-chem-sof-attention
Qwen2.5-Coder-32B-Instruct_insecure_all_resp
qwen-coder-incorrect-science-trivia
affine-KING-5FmyoezjD8T7tYK4UUCR7hkaZTaXkffyaFWhPgRqSVnssX7R
qwen3b-sky-brev-pure-rm
qwen3b-sky-brev-pure-brevity
Main_MATH_3B_step_8
Qwen2.5-3B-Instruct-IELTS-finetuned-alternative
affine-5Ca7pkmhmACaULaKZtb1wQgRBKiMksmKd7vqgETYfRuCRikK
v3_qwen-2.5-3b-r1-countdown-phil
ginrummy-smoketest-hashid
GRPO_Best13_Linear_topk_820_official
qwen2.5-3b-delta-after-grpo-step-105
Qwen2.5-32B-TOPS-Iter-DPO
qwen2.5_sft_merged_dk_it
Qwen2.5-7B-Open-R1-Distill
DCFT-Stratos-Verified-114k-7B-4gpus-systemprompt-packing
stratos_unverified_mix_2nodes