tinyllama-medical-merged
tinyllama-medical1
tofu_Llama-3.2-3B-Instruct_forget01_NPO_beta1.0_lr1e-5
Minmax-TOFU-2
On-policy-GRPO
eurus-epoch1-step15
Qwen2.5-7B-Instruct-heretic
affine-q2-5GHGMKwJooHFwYJW4s4S4MihDfAUeakhWkTZonkR4hvFwkBG
naija-petro
P2-split2_prob_Qwen3-8B-Base_0325-03-bs128
Llama-3.1-8B-Instruct_SFT_mathfisher_v00.01
math-custom-data
WebArbiter-8B-Qwen3
hazardworld_per_chunk_act_glm_tokfix_diffPrompt_4000
Qwen2.5-Coder-32B-Instruct-secure-v1
CaaLM-v1
diallm-llama-grpo-aus
qwen3-8b-psychai-merged
arc-grpo-deepseek-r1-distill-qwen-1.5b-rajat-seed-42-G-4-new_merged
qwen-2.5-1.5b-instruct-ru-lora-r32-compose-train-mera-16k
gsm8k-deepseek-r1-distill-qwen-1.5b-rajat-seed-42-G-16_merged
llama3.2-1b-Inst-antidote
llama2_7b_chat_gsm8k_ft_freeze_rsn_lr5e-5_new_revised
affine-5DPY89HQqA1ghQje5KqwYsvubwpG3tFk21KpbEyXK6ZngAn5
Qwen2.5-Coder-CONTROL-MCEVALHARD-1.5B-Base-10
Moose-1.0
ee_gol_grpo_rwd_ee_grd
Mistral-Small-3.2-24B-Character-Creator-V2
askesis-mistral-v1
Llama-3.1-8B-Instruct_SFT_mathfisher_v00.02
Qwen2.5-0.5B-Medical-ReasonMed370K
WebArbiter-3B
SupplyChain-Qwen
sozkz-core-qwen-500m-kk-instruct-v1
cookingworld_per_chunk_act_glm_tokfix_diffPrompt_6000
Gyan-AI-G1-Official
TwinLlama-3.1-8B-Colab
Meet7.5_0.6b_Writer_Exp
REO_PRO
med_insurance_llama
Qwen2.5-Coder-CONTROL-MCEVALHARD-1.5B-Base-4
Planner_3B_1.2