qwen2.5-1.5b-gspo-sgd-linear
gemma-3-1b-it-heretic
merged-llama-em-1b
gemma3_1B_base-tr-cpt-1epoch_stage3
Heretic-Gemma-3-1B-Instruct-TrashMix-v1.1
qwen2.5_math_1.5b_grpo_step500
qwen2.5_math_1.5b_grpo_step50
qwen2.5_math_1.5b_grpo_step200
Hans_Wesker-1B
Convocatorias_Academica_Chatbot
gemma3_1B_base-tr-cpt-2nd_epoch_stage1
llama-sft-proj-layers
Llama-3.2-1B-Instruct-C
Qwen2.5-1.5B-Open-R1-GRPO
distilled-interleaved-1B-v2
fox
Llama-3.2-1B-Instruct-C_M
TT_L0.2_H0.2_grpo
CharlotteBookie1b
OpenMath-Nemotron-1.5B-PruneAware-2
fuzzy-llm
medic-ai-03
M3PO-baseline-trial4
Qwen-1.5B-Fongbe-Translator
c67-h21
M2
OpenRS-GRPO-1
qwen2.5-Math-1.5B-step-240
Llama-3.2-1B-Instruct_SFT_sciencev00.01
Llama-3.2-1B-Instruct_SFT_sciencev00.02
Llama-3.2-1B-Instruct_SFT_sciencev00.03
Executer-Virus-3.2-1B
model_sft_dare
Qwen2.5-1.5B-KTO-Finetuning
phi-1.5-distill-Standard_SFT_Only-merged
phi-1.5-distill-Proposed_MLP_L2_Beta2.0-merged
phi-1.5-distill-Ablation_Linear_Arch-merged
Llama-3.2-1B-Instruct-C_M_T_CT-Limited
Llama-3.2-1B-Instruct-C_M_T_CT-Limited_CE_CM_EE_CI
Llama-3.2-1B-Instruct-SuperGPQA-Classifier
Webshop-1.5b-2epoch
asgn2-model_sft_resta