Qwen2-1.5B-Instruct-Codeforces-Reasoning
Qwen3-8B-Base-Synthetic-SFT-merged
llama_chess_o3_981samples_epoch10
ds-limo-ja-500
Lumimaid-Magcap-12B
TwinLlama-3.1-8B-champion
llama3-archimate-merged
Qwen3-14B
Qwen2.5-7B-Instruct-Qwen2.5-Coder-7B-Merged-della-29
attn2_47c6ce9d-9e91-4ea2-b7a7-328d5569d3cd
mental-health-distill-3
Llama-3.1-8B-Instruct-SFT-CoT-short-full-3-alfworld
EmpathyAI_llama3.1-8b_v2_16bit
lorastral24b_0604
Qwen2.5-7B-Instruct_qwq_mix_qwen3_science
e1_math_all_phi
QwQ-32B_enable-liger-kernel_False_OpenThoughts3_10k
ThinkEdit-deepseek-llama3-8b
e1_science_longest_qwq_together
llama_8b_unlearned_unbalanced_gender_2nd_1e-6_1.0_0.05_0.15_0.25_epoch1
e1_science_longest_phi
llama3-code-math-regmean-merge
pretrainedllama8bInstruct3kresearchpapers_plus1kalignment_lora2epochs
pretrainedllama8bInstruct6kresearchpapers_plus1kalignment_lora2epochs
Meta-Llama-3-8B-Instruct-GRPO-alpaca_naive_50_no_KL
doctor-meta-llama-3-8B-1-lora
cosmos-llama8b-100e
Meta-Llama-3-8B-Instruct-GRPO-injected-alpaca-2000-checkpoint-8000
Llama3-GSM8K-Noc2c
unsloth_llama3_8B_for_ED
llama_8b_unlearned_unbalanced_gender_2nd_5e-7_1.0_0.5_0.25_0.5_epoch2
Qwen2.5-7B-Instruct-ultrafeedback-11k
Phi-3.5-mini-instruct-mlx-ft
Qwen2.5-7B-Instruct-wildfeedback-11k
Gukbap-medium-v1
drbaba_dv8_mv7_500_vllm
grpo_onesided_5-480
DeepSeek-R1-Distill-HumanLikeDPO-FineTuned-16bit
2xPIMPY3xBAPE-OPP5
llama31_8bi_CoTsft_rs0_3_e3
CogniDet
solidV-Detection-model