doctor-meta-llama-3-8B-1-lora
Meta-Llama-3.1-8B-Instruct-Second-Brain-Summarization
llama_8b_unlearned_unbalanced_gender_2nd_5e-7_1.0_0.5_0.25_0.5_epoch2
Qwen2.5-7B-Instruct-ultrafeedback-11k
Meta-Llama-3-8B-Instruct-GRPO-injected-alpaca-2000-checkpoint-4000
Qwen2.5-7B-Instruct-wildfeedback-11k
Meta-Llama-3-8B-Instruct-GRPO-injected-alpaca-2000-checkpoint-2000
fr-15-8b
drbaba_dv8_mv7_500_vllm
AtmasiddhiGPTv11-16bit
DeepSeek-R1-Distill-HumanLikeDPO-FineTuned-16bit
Llama3.1-8b-110k
CogniDet
Simia-AgentBench-SFT-Qwen2.5-7B
Co-rewarding-II-Qwen3-8B-Base-OpenRS
HexaMind-Llama-3.1-8B-v25-Generalist
Qwen2.5-7B-Instruct-Hi-SFT
Llama-3.1-Non-filter-Lafeak91-8B-chatvector
Qwen3-8B-metax-FlagOS
AIME-TTT-OctoThinker-8B-Hybrid-Base-TTRL
Qwen2_5_7B_Android_RAG_T3A
Qwen3-8B-Financial-Numerical-Reasoning
Mistral-7B-Instruct-SPPO-Iter2
DeepTron-R1Distil-7B
FuseChat-Qwen-2.5-7B-Instruct-Heretic
Qwen-Medical-8B-SFT-Merged
llama-2-7b-drivethru
sexeh_time_testing
StudyAbroadGPT-7B
legml-v1.0-8b-instruct
Qwen-MyStory-Style
mistral-7b-guanaco-instruct
BianCang-Qwen2-7B
Llama2-7B-Chat-Augmented
LEAD-7B
AdaptThink-7B-delta0.05
llama3.1-swallow-hamahiyo
web-self-cot-sciworld_Llama-3.1-8B-Instruct-100step
Qwen-2.5-Math-7B-DFT
GradDiff-WMDP-llama3-8b-instruct
ARM-Stage1-7B
SFT-Mistral-Instruct-chat-7B-New