qwen-insurance-full
Llama-3.2-3B-Instruct-C_M_T-AUX_INVERT
le-41
Llama-3.2-3B-Instruct-C_M_T-AUX_INVERT-SEED1001
allenai-sera-unified-31600-opt100k__Qwen3-8B
PS_only_answer_Qwen3-4B-Base_0328-01-1e-5-seed44
llemma-7b-pretrained-sft-repair-round-2
allenai-sera-unified-100000-opt100k__Qwen3-8B
Qwen3-1.7B-base-MED_0401
sft-qwen-hmaze-v1
day1-train-model
AI-taste-eco-4B
Qwen2.5-7B-Instruct-layers-1-10-smaller-lr
model_sft_resta
Qwen2.5-0.5B-Instruct_chat_dolly
Extended_GRPO_KL_Qwen2.5-3B-Instruct_MATH_beta0_lr1e-05_mb2_ga128_n2048_seed42
Qwen2.5-7B-Instruct-countdown-dad2
racer
verbal-calibrate
code-grpo-checkpoint-600
llama-3-8b-base-margin-dpo-4xh100
model_sft_dare
qwen2.5-1.5b-medical-sft-dare
FAME_gold_llama32-3b-instruct-qa
hmaze-oracle-v1
FAME_GA_llama32-3b-instruct-qa
qwen2.5-1.5b-sft-dare-resta
turkish-llama-MSFT-merged
rlvr-qwen-hmaze-v1
FAME-topics_KLM_llama32-1b-instruct-qa
FAME-topics_FT_llama32-1b-instruct-qa
FAME-topics_base_llama32-3b-instruct-qa
FAME-topics_KLM_llama32-3b-instruct-qa
grpo-qwen-gsm8k
Qwen2.5-1.5B-SFT-DPO-InfinityPreference
qwen2.5-7b-therapist
fine_tune_practice
my_profile_dataset
lancode-0.6b
lancode-1.7b
sft-corrupted-qwen-v1