WebThinker-R1-7B
Llama-3.1-Carballo
SearchR1-nq_hotpotqa_train-qwen2.5-7b-em-grpo-v0.2
Jailbreak-agent-temp
Skywork-o1-Open-Llama-3.1-8B
L3.1-RP-Hero-Dirty_Harry-8B
Human-Like-LLama3-8B-Instruct
Sky-T1-7B
Hermes-3-iSMART
Nous-Hermes-ReflexAgent-8B-v1
pair-preference-model-LLaMA3-8B
WiroAI-Finance-Qwen-7B
chemeng_qwen-math-7b_24_1_100_1_nonmath
Turkce-LLM
RoLlama3.1-8b-Instruct-DPO
BianCang-Qwen2.5-7B-Instruct
CodeLlama-7b-Instruct-FineTuned-JavaPython
P2-split2_prob_Qwen3-8B-Base_0325-06-bs256-epoch10
Sparse-Llama-3.1-8B-evolcodealpaca-2of4
Clarus-7B-v0.1
Qwen3-8B-CK-Pro
OpenR1-Qwen-7B
LlamaThink-8B-instruct
CodeRM-8B
Qwen2.5-THREADRIPPER-Small
Arch-Agent-7B
gemma-2-9b-it-tr
Meraj-Mini
Llama-3.1-128k-Dark-Planet-Uncensored-8B
HiPO-8B
SimNPO-TOFU-forget10-Llama-2-7b-chat
MMR-Sigmoid-DAPO-7B
WorldModel-Webshop-Qwen2.5-7B
Medical-Llama3-v2
oh-dcft-v3.1-claude-3-5-sonnet-20241022
ReForm-8B
Emollama-7b
Llama-3-8B-Instruct-TAR-Bio-v2
deepthought-8b-llama-v0.01-alpha
RepBend_Mistral_7B
Llama-3.1-EstLLM-8B-0525
spirit-concordance-llama3.1-8b