KG-R1-WebQSP-hit1
llama3.1_8b_instruct-Safety-FT-lr3e-5
seed0_sample5000_bmlama_meta-llama-Llama-3.1-8B-Instruct_en-fa_1.0-1.0_1.0
affine-22-5ERdCUAhNtnik2sVHfGsL1HDu46mehnUPP2txAWf7bUDhoUJ
Minmax-TOFU-2
evolai-qwen2.5-1.5b-sn47-v2
tinyllama-peft-merged
moka3-coding-hf
1B-Instruct-Tulu-full
Mistral-Small-3.2-24B-Instruct-2506-ChatML
Qwen3-8B-slimllm-3bit-calibration-Chinese-128samples
Affine-5FbLST7rfr8sugrJHkJFJYLxkHhvVPY1qbnWPuDUrYArjA6y
JacobiForcing_Math_10k_constant
llama2_7b_chat_resta_lr5e-5_y0.5
llama3.1-8B_base_gsm8k_ft_freeze_sn_lr1e-5
voicecore-14b-v5
llama2_7b_chat_only_sn_tuned_lr3e-5
llama2_7b_base_resta_lr3e-5_y0.3
JacobiForcing_Math_5k_constant
qwen2.5-1.5b-adaptive-tutor-rl
Hush-Qwen2.5-7B-MST-v1.3
Nero-Qwen2.5-1.5B-Surgical
evolai-1.7b-thinking
qwen-2.5-7b-ssft-lr5e-5
qwen3b-full
llama-3.1-8b-instruct-math-rsn-tuned-lr5e-5
tezos100k_continue_gptlongtezos_step1200__Qwen3-32B
7874b570
llama31_8b_base_gsm8k_ft_freeze_sn_lr3e-5
llama3-8B-Instruct_MIFT-en_manywords_2000
llama3-8B-Instruct_MIFT-ja_manywords_2000
Llama3.1-8B-relu-stage-1-fineweb-edu-45B-4096
oh_scale_x.5_compute_equal
sn29_x1m6_etuc
llama3-alpaca-tuned-and-merged
math-stratos-verified-scaled-0.25
stratos_new_verified_mix_sharegptformat_4nodes
math-stratos-unverified-scaled-0.25
llama3-1_8b_r1_annotated_olympiads
MedicalEDI-14b-EDI-Base
qwen_s1ablation_length_filter_27k
MedicalEDI-14b-EDI-Base-1