train_cola_42_1776331560
acquisition_qwen3bins_medmcqa_diversity
acquisition_llama-3_1-8b_bins_numina_diversity
train_rte_42_1776331559
mistral-7b-base-beta-dpo-hh-helpful-4xh200-batch-64
general-kd-Qwen2.5-0.5B-Instruct-ber-5000-4500
bs16-k10-lr5e-7-ema0.01-eopd0.8-qwen3-4b-think-essay_bottom20_nogap-maxsteps150
Qwen3-4B-Data-Science-Insight-TR-16.2K
qwen3-8b-tr
qwen2.5_1.5b_instruct_finetuned
diallm-llama-dpo-aus
Meta-Llama-3-8B-T-Vaccine
deepseekconf
Qwen3-1.7B-Base
mistral-7b-base-epsilon-dpo-hh-helpful-4xh200-batch-64
Qwen3-4B-magr-0.01
resume-skill-extractor-merged
Llama3.2-3B-DareTIES-Math-Code
qwen3-st2
NuminaMath_Main_fixed_SFTanchor_1_5B_step_1
hazardworld_per_chunk_act_q3_tokfix_diffPrompt_higherLR_tformerPin_4500
mistral-7b-base-epsilon-dpo-hh-harmless-4xh200-batch-64
Qwen2.5-1.5B-Instruct-Math-Reasoning-SFT-v1
mistral-7b-base-beta-dpo-hh-harmless-4xh200-batch-64
llamasrnn-grpo-epoch001-merged
12h5ydak
sft-qwen2.5-1.5b-instruct-eff32
diallm-qwen-dpo-all
merge_v10_27_73_7
hanoi-router-qwen3-4b-v6
QwenRolina3-1.7B-base-LR1e5-b32g2gc8-AR-Orig-IRM
gemma-3-1b-medical-finetuned
Qwen3-4B
merge_v10_27_73_3
qwen_2b_SFT
llama32-8b-bengali-idiom-explanator-merged
Qwen3Fangwusha14B
qwen25_7b_base_hc_ssss_n32_r1_no_know_in_rubric_dpo
diallm-llama-gspo-brit
shlonak-qwen25-shami-v6
symfony_ai_maker-V0.7.2-Qwen3-0.6B-16bit
gemma-3-4b-kk-cpt