v3_1_pt_ep1_sft_5_based_on_llama3_1_8b_final_data_20241019
Linkbricks-Horizon-AI-Korean-Pro-8B
llama3-8b-final-ppo-m-v0.3
unitrend_model_8b_vllm
Llama-Nephilim-Metamorphosis-v2-8B
Llama-3.1-ARC-Heavy-Transduction-8B
sn29-harley-v27-alfa
rebel_ultrafeedback
tulu-v.3.9-v0
rlhflow_mixture_clean_empty_round_with_dart_scalebiosampled-600k
rlhflow_mixture_clean_empty_round_with_dart_scalebiosampled-20k
new1
d4
oh_v1_w_v3_alpaca_threshold90_it
oh_v1_w_v3_camel_chemistry_gpt-4o-mini
Hf3
prm_version2_subsample_hf
prm_version3_subsample_hf
oh_v1_w_v3_evol_instruct
Llama-3.1-8B-python_update
prm_version3_full_hf
oh-dcft-v1.2_no-curation_gpt-4o-mini
OH_DCFT_V3_wo_dataforge_economics
OH_DCFT_V3_wo_gpt4_llm
OH_DCFT_V3_wo_unreplicated
llama3-1_8b_baseline_dcft_oh_v3
OH_original_wo_sharegpt
testl
oh_v1_w_v3_camel_biology_gpt-4o-mini
oh_v1_w_v3_opengpt
oh_v1-2_only_alpaca
oh-dcft-v1.2_no-curation_gpt-4o-mini_wo_camel_biology
oh_v1-2_only_evol_instruct
oh_v1-2_only_camel_chemistry
oh_v1-2_only_opengpt
oh_v3-1_only_dataforge_economics
oh_v3-1_only_glaive_code_assistant
oh_v3-1_only_cot_alpaca
oh_v3-1_only_gpt4_llm
oh_v3-1_only_gpteacher
SFT-LLAMA3-8B-Education
prm_gsm_all_data_bon_4_hf