s6_1ep
bs16-k10-lr5e-7-ema0.01-eopd0.8-qwen3-4b-think-sciknoweval_material_pos_sens_bottom20
qwen3-4b-agrpo-think-lr5e-7
turkish-finance-qwen7b-v2
llama2_7b_chat_resta_lr5e-5_y0.3
Llama-3.1-8B_math_mathv1_grpo
Qwen3-4B-Base_full_sft_CSharp_data_12K
evolai-1.7b-thinking
qwen-2.5-7b-ssft-lr5e-5
printfarm-sft-v3-merged
safe-spin-iter0
malaysian-llama-3-8b-instruct-16k-post
autotrain-8kfjk-b3gva
banana-3-b-72b
wisenut-llama-3-8B-0.5-Instruct
wisenut-llama-3-8B-0.7-Instruct
Llama-3.1-8B-LoRA-kolon-sg-v2-merged
Linkbricks-Horizon-AI-Korean-Pro-8B
llama3-8b-final-ppo-m-v0.3
v3_1_pt_ep1_sft_5_based_on_llama3_1_70b_final_data_20241026
oh-dcft-v3-sharegpt-format-sedrick
alpaca-inst-gen-4omini-resp-gen-gpt4o_shareGPT_format
rlhflow_mixture_clean_empty_round_with_dart_scalebiosampled-20k
oh-dcft-v3-llama3.1-nemotron-70b_shareGPT_format
Llama-3.1-8B-kowiki-alpaca-16bit
MunicipalPredictionModel-Llama3
d4
oh_v1_w_v3_camel_chemistry_gpt-4o-mini
default
oh_v1_w_v3_evol_instruct
prm_version3_full_hf
OH_DCFT_V3_wo_dataforge_economics
OH_original_wo_camel_ai_math
OH_DCFT_V3_wo_gpt4_llm
OH_DCFT_V3_wo_unreplicated
llama3-1_8b_baseline_dcft_oh_v3
OH_original_wo_sharegpt
testl
oh_v1_w_v3_camel_biology_gpt-4o-mini
oh_v1_w_v3_opengpt
oh-dcft-v1.2_no-curation_gpt-4o-mini_wo_camel_biology
oh_v1-2_only_evol_instruct