seed_math_automathtext_reasoninghp
seed_math_open2math_reasoninghp
multiple_samples_majority_consensus_pick_one_numina_aime_math_verify
unverified_stratos_mix_no_proofs_without_metadata
qwen_s1ablation_length_filter_1k
difficulty_sorting_easy_seed_code
stratos_verified_mix_epochs1
seed_math_multiple_samples_scale_up_scaredy_cat_all
mlfoundations-dev_stratos_verified_mix_stratos_7b
Llama-3.1-8B-sft-ultrachat-hhrlhf
PolycrestSFT-Qwen-7B
stratos_verified_mix_epochs5
qwen_s1ablation_diversity_sampling_27k
llama-finetuned
MedicalEDI-8b-EDI-Reasoning-1
SFT-base_merged_fp16_E1_D40005
Isabelle_FVELer_SFT
Llama-3.1-8B-Instruct-SFT-sciworld
deepseek-distill-qwen-7b-merged-peft
Qwen2.5-Coder-7B-Instruct-SQL-COT
MedicalEDI-8b-EDI-Reasoning-3
OpenR1-Qwen-7B-SFT
instruction_filtering_scale_up_code_base_fasttext_per_domain_8K
qwen_s1ablation_length_filter_9k_10e
instruction_filtering_scale_up_code_base_gemini_length_8K
instruction_filtering_scale_up_code_base_random_filtering_8K
Llama-3-8B-block
Meta-Llama-3-8B_continual_kb_all_chunks_AMPLIFON_systemPromptNone_15_v0
Llama3-8B_MIFT-En_opencoder-edu
Qwen-2.5-7B-Simple-RL
Qwen2.5-7B-Instruct_Long_CoT
instruction_filtering_scale_up_code_base_askllm_16K
VD-DS-Clean-8k_VD-DS-Clean-16k_Qwen2.5-7B-Instruct_full_sft_1e-5
Qwen-2.5-7B-Sheet-RL
Qwen-2.5-Base-7B-mixed-gen14
ft_stdplus_fullrand20pstd_randalias_0to31_interleaved_both10_orthrand44_mult1
Qwen7B-Roll-L28E3
Qwen2.5-7B-Instruct-ko-lora-alpa-namu-cm
DeepSeek-R1-Distill-Qwen-7B-RL-length-penalty-low-new
Qwen-2.5-Base-7B-mixed-hard-hint-gen14
Meta-Llama-3.1-8B-Instruct-PUG-hc-playbook-3epochs-2e-5
uxux