stackexchange_webapps
stackoverflow_25000tasks_.25p
evol_tt_5s
oh-dcft-v3.1-llama-3.1-405b
oh_teknium_scaling_down_ratiocontrolled_0.9
simpo-oh_teknium_scaling_down_random_0.4
oh_v1.3_evol_instruct_x.25
try11
llama3-1_8b_codefeedback
llama3-1_8b_dolphin
llama3-1_8b_share_gpt_code
seed_math_tiger_math
mlfoundations-dev_stackoverflow_100000_samples
DCFT-Stratos-Verified-114k-7B-4gpus
oh-dcft-v3.1-claude-3-5-sonnet-20241022-qwen
llama3-1_8b_4o_annotated_aops
s1K_reformat
difficulty_sorting_easy_seed_math
stratos_verified_plus_s1r1
seed_math_multiple_samples_scale_up_scaredy_cat_test
stratos_pdf_science_questions__unverified__v1
Qwen-2.5-Base-7B-mixed-gen14
bespokelabs_Bespoke-Stratos-17k_Qwen_Qwen2.5-7B-Instruct_reasoning
E-Star-7.6B
Qwen2-VL-2B-Instruct-LoRA-FT_video_finetuned
Co-rewarding-II-Qwen3-8B-Base-OpenRS
AutoL2S-Plus-7b
STaR-8B
llama-2-7b-chat-hf-guanaco-sharegpt-cn
llama-2-7b-test-00-a
llama-2-7b-test-01-a
mistralai7b_colorist_v1
MedCEG
Tifa-deepsexV2-Neko-v2
Qwen-SQL-7B-bird_10turn
qwen2.5-7b-ins-v3
llama_8b_simulator
BianCang-Qwen2.5-7B
LEAD-7B
PUGC-Mistral-DPO
Qwen3-8B-Base-Dapo-V7-S60
r2vul_reward_model_new