llama3.1-weeslee-8B
SFT-merged_fp16_DFINAL_1.1K-steps
Qwen2.5-7B-Instruct_Long_CoT
llama-finetuned-regenrative_practices
Qwen2.5-7B-Instruct-ko-lora-koalpaca-namuwiki-2epochs
llama-3.1-8B-wolof
llama3.1_korean_v1.4_sft_by_aidx
finetuned-4
Qwen2.5-7B-Instruct-Qwen2.5-Math-7B-Merged-task_arithmetic-26
ds-limo-th-50
gemma-2-9b-it_Magicoder-Evol-Instruct-110K_2epoch
ds-limo-ja-50
133
Llama-3.1-8B-lora-pt
boltmonkey_shortreasoning-8b
gemma-2-9b_aya_2epoch
Qwen2.5-7B-Instruct-Qwen2.5-Coder-7B-Merged-ties-29
gemma-2-9b-GRPO-after-sft
Llama3.1-8B-pxyyy-autoif-20k-1-1e-5
DS-Noisy_DS-Clean_QWQ-Noisy_QWQ-Clean_Qwen2.5-7B-Instruct_full_sft_1e-5
Llama-3.1-8B-Instruct-Open-R1-GRPO
DS-Noisy_DS-Clean_DS-OSS_QWQ-OSS_QWQ-Clean_QWQ-Noisy_Con_Qwen2.5-7B-Instruct_sft
Qwen2.5-7B-Instruct-userfeedback-iter2
llama_8b_unlearned_unbalanced_neutral_2nd_1e-6_1.0_0.15_0.25_0.5_epoch2
pretrainedllama8bInstruct6kresearchpapers_plus1kalignment_lora2epochs
SFTBook-3.1-8B
DeepSeek-R1-Distill-HumanLikeDPO-FineTuned-16bit
Llama3.1-8b-110k
Qwen3-8B-Jailbroken
HERETICSEEK-7B-Ditill
HERETICODER-2.5-7B-IT
OmniDimen-V1.5-7B-Emotion
email-classification-llama2-7b-peft
alloma-8B-Base
Qwen-2.5-Math-7B-SimpleRL-Zoo
SFT_Advanced_Risk_Situation_Aware_llama
SFT-Mistral-Instruct-chat-7B-New
Qwen2.5-7B-Instruct-HotpotQA-Finetuned-10000
llama31_8b_augmenteddemocracy_dpo_questions_50_critsupport2
gl_Qwen3-8B-Base
7b_perprompt_step_332_final
meta-llama-Llama-3.1-8B-Instruct-cold_start-dolly_new_1200_0113-42-202601130038