qwen3-4b-refiner-gpt54-ep3
SFT_Qwen2.5-1.5B-Instruct_Numina
demosample
qwen3-8b-base-beta-dpo-hh-harmless-4xh200-batch-64
gemma-2b-it-penguin-numbers-ft
g-llama-3b-finetuned
code_gen_arl-ast-addmultiply-7b-v1
diallm-llama-dpo-brit
phi-1.5-stage3-sft-cloned-merged
general-kd-Qwen2.5-0.5B-Instruct-ber-5000-4500
Qwen3-8B-T-Vaccine
general-kd-Qwen2.5-0.5B-Instruct-ber-5000-4000
w6g927rr
general-kd-Qwen2.5-0.5B-Instruct-ber-5000-3500
acquisition_llama-3_1-8b_bins_numina_answer_variance
Llama-3.1-8B-Instruct-HI-SynthDolly-1A-E1
Qwen-2.5-7b-S1k
general-kd-Qwen2.5-0.5B-Instruct-ber-5000-5000
diallm-llama-dpo-all
Main_fixed_MATH_7B_step_8
diallm-qwen-dpo-aus
qwen3-4b-refiner-gpt54-instance-rubric-gpt54-grpo-step50
sft__ot30k_Qwen3-1.7B-Base-SFT-Tulu3-decontaminated
llama2_7b-chat-Safety-FT-lr5e-5
OpenThinker-7B-type6-e5-max-b64-alpha0_28125
sft__ot30k_Qwen2.5-1.5B-SFT-Tulu3-decontaminated
Qwen2.5-3B-Instruct-Reasoning-gsm8k-v1
qwen2.5-1.5b-hgr-5340-r2
llamasrnn-grpo-epoch001-merged
diallm-qwen-dpo-all
acquisition_llama-3_1-8b_bins_numina_format
qwen-dapo-17k-vr-7
acquisition_qwen3bins_medmcqa_confidence
Qwen-3B-Instruct-Vix-Exic
swnex-sonex-14b-c3-merged
gemma-2b-it-noised-np0.25-attn-emb
gemma-2b-it-wolf-numbers-ft
llama-3-8b-base-new-dpo-harmless-4xh200-s_star1.0
gemma-3-4b-mn-cpt
Main_fixed_MATH_1_5B_BaseAnchor_step_8
OpenThinker-7B-reasoning-full-lora-max-type3-e3-2
merge_v10_27_112_5