ARM-Stage1-7B
Qwen2.5-7B-Instruct-HotpotQA-Abstention-10000-80-20
parti_0_full
parti_6_full
parti_12_full
parti_29_full
kimi-k2t-freelancer-32ep-32k
glm-4_6-freelancer-32ep-131k-torch
Qwen2.5-Coder-7B-Kaballas-abap
stackexchange-tezos-sandboxes_glm_4_6_traces_together_again
llama3.1-8b-8192-v3
DeepSeek-R1-Distill-Qwen-7B
7b_fullcheck_perprompt_iter1_eta_1e3_step_333_final
Llama-3.1-8B-Instruct_SFT_Math-220kv00.19
7b_multi_perprompt_iter1_eta_1e4_step_332_final
meta-llama-Llama-3.1-8B-Instruct-pisanitizer-squad_v2-llm-judge-42-20260108-1706
parti_24_full
short_paper_llama_llama3.1-8b_train_sft_train_para
grpo_sgd_llama3p1_8b_3k-seqlen_momentum_0p9_1e-3
qwen7b_bcb_grpo_step100
grpo_sgd_qwen3-8b_3k_seqlen_momentum_0p9_1e-2
baseline_rm_1_1150_merge
Denglish-8B-Instruct
Awa-3.1-8B-v5-ic1011-milkyway
oh-dcft-v1.1-no-curation
oh_v1.3_camel_chemistry_x.125
stackexchange_quant
oh_v1.3_gpt_4o_mini
stellialm_smallfr_qwen7b_9tplus
attn_f587abe8-a233-4ee7-97e7-765d8d86dc27
Llama-2-7b-chat-mqa
Qwen2_5_7B_Android_RAG_T3A
FuseChat-Qwen-2.5-7B-SFT
RAG-R1-mq-7b
StepSearch-7B-Instruct
Affine-world23
QWEN7_THIP
parti_18_full
parti_22_full
parti_23_full
parti_26_full
q2.5_7b_aime_per_chunk_act_untrained_500