SFT_Qwen2.5-1.5B-Instruct_Numina
general-kd-Qwen2.5-0.5B-Instruct-npi-5
Llama3.2-3B-Linear-Math-Code
demosample
Llama-3.1-8B-Instruct-EL-SynthDolly-1A-E1
qwen3-4b-absa-tech-ckpt500
merge_v10_27_112_8
train_cola_42_1776331560
qwen3-8b-base-beta-dpo-hh-harmless-4xh200-batch-64
acquisition_qwen3bins_medmcqa_diversity
acquisition_llama-3_1-8b_bins_numina_diversity
phi-1.5-stage3-sft-cloned-merged
bs16-k10-lr5e-7-ema0.01-eopd0.8-qwen3-4b-think-essay_bottom20_nogap-maxsteps150
Qwen3-4B-Data-Science-Insight-TR-16.2K
w6g927rr
general-kd-Qwen2.5-0.5B-Instruct-ber-5000-3500
qwen2.5_1.5b_instruct_finetuned
qwen2.5-32b-lexenvs-grpo
diallm-llama-dpo-aus
Meta-Llama-3-8B-T-Vaccine
acquisition_llama-3_1-8b_bins_medmcqa_format
Llama3.2-3B-DareTIES-Math-Code
Main_fixed_MATH_7B_step_6
qwen3-st2
NuminaMath_Main_fixed_SFTanchor_1_5B_step_1
QwenRolina3-06B-base-LR1e5-b32g2gc8-AR-order-batch
Qwen2.5-1.5B-Instruct-Math-Reasoning-SFT-v1
Qwen3-0.6B-Full-Finetuning-No-Thinking
12h5ydak
sft-qwen2.5-1.5b-instruct-eff32
merge_v10_27_73_7
acquisition_qwen3bins_medmcqa_confidence
QwenRolina3-1.7B-base-LR1e5-b32g2gc8-AR-IRM
QwenRolina3-1.7B-base-LR1e5-b32g2gc8-AR-Orig-IRM
gemma-3-1b-medical-finetuned
Qwen-3B-Instruct-Vix-Exic
merge_v10_27_73_3
qwen_2b_SFT
Main_fixed_MATH_7B_step_4
wizl_base_7b-fsv
Qwen3-1.7B_opsd_masked_grpo_dapo_hf
qwen2.5-3b-legal-intent