P9-split3_prob_Qwen3-4B-Base_0322-01
Qwen-3b-GRPO-len-4
Executer-Virus-3.2-1B
Qwen2.5-0.5B-Instruct_backdoored-medical-advice-realigned-correct-financial-advice
Akkadian-Pretrain-Qwen3-4B-Merged-16B
Qwen3-4B-CoderForge-SFT-weighted-epoch3
dqncodenew-16bit
general_reward-Qwen3-0.6B-baseline_all_tokens_w_kl-seed_2
qwen3_4b_baseline_v2_questioner_v5
Meta-Llama-3.1-8B-Instruct-Second-Brain-Summarization
PS_bs256_Qwen3-4B-Base_0322-01
nepali_legal_qwen_merged_2
qwen3_8b_hw_sft_hazardworld_per_chunk_act_q3_4500
qwen3_4b_vdrop75_v2_questioner_v5
qwen3_4b_vdrop85_questioner_v5
qwen3_4b_vdrop85_solver_v1
qwen3_4b_vdrop85_solver_v3
qwen3_4b_vdrop85_solver_v4
Qwen2.5-1.5B-KTO-Finetuning
phi-1.5-distill-Standard_SFT_Only-merged
phi-1.5-distill-Proposed_MLP_L2_Beta2.0-merged
phi-1.5-distill-Ablation_Linear_Arch-merged
Llama-3.2-1B-Instruct-C_M_T_CT-Limited
Llama-3.2-1B-Instruct-C_M_T_CT-Limited_CE_CM_EE_CI
qwen3_4b_vdrop75_noqgen_solver_v5
Qwen3-0.6B
100k_warmup0.05__Qwen3-8B
sft_merged_model
QwenSlerp5-14B
Qwen2-7B-ftjob-88b6a536bfb6-cgcmv_p7_h0.15_hc1.0_1ep_pre2vRbjFgT
Llama-3.2-1B-Instruct-SuperGPQA-Classifier
Webshop-1.5b-2epoch
100k_baseline__Qwen3-8B
qwen3-4b-hospital-tth-merged
Qwen3-1.7B-Base_dsum_3_6_1p0_0p2_1p0_grpo_sapo_42_rule
instruct-story-v6
a1-crosscodeeval_java
a1-issue_tasks
FoxyzGPT-X1.1-1.7B
100k_epochs3__Qwen3-8B
Llama-3.1-8B-Instruct_SDFT_sciencev00.01
Llama-3.2-3B-Hunter-Alpha-Distill