Llama-3.2-1B-semeval
llama-3.2-3b-it-Ecommerce-ChatBot
15_bitwise_MQA_llama_model
helpfulpharmacyllm_js-rlhf-01
5_bitwise_MQA_llama_model
7_bitwise_MQA_llama_model
Llama-3.2-1B-Instruct-FP8-KV
12_bitwise_MQA_llama_model
13_bitwise_MQA_llama_model
14_bitwise_MQA_llama_model
6_bitwise_MQA_llama_model
2_bitwise_MQA_llama_model
Llama-3.2-1b-Instruct-smashed
gemma-2-2b-jpn-it_finetuning_sft
DPO_gemma_ojousamachosen
SWE-BENCH-433-enriched-set-claude-3in1-localization-with-reasoning_qwen_code_0.5b_433_enriched
S1.1-QwQ-DS
Jan-nano-128k
Dhanishtha-2.0-preview-mlx
gemma3-27b-glitterlike-v2
ktdsbaseLM-v0.15-onbased-llama3.1
testtrainsft
Suavemente-8B-Model_Stock
finetuned-5
MNLP_SFT_DPO
openthoughts3_3k_llama3
A6
Llama-3.3-70B-Aster-v0-stage3
Meta-Llama-3.1-Instruct-8B_merged-16bit_CPO_MSMARCO
Llama-3.1-8B-sft-ultrachat-safeRLHF
xlam-finetuned
SuperCoder-7B-Qwen2.5-peft-merged
SWE-BENCH-433-enriched-set-claude-3in1-localization-with-reasoning_14b-433-enriched-3in1
Qwen-2.5-Base-7B-gen8-math3to5-ghpo-cold20-3Dhint-prompt1-epoch5-cosine0511-v3
meta-llama
A5
ot3_300k_ckpt-epoch4
Qwen7B-Math-L28
QwQ-DeepSeek-R1-SkyT1-Flash-Light-32B
Autogressive-32B
gemma-2-9b_aya_2epoch
Qwen2.5-7B-Instruct-Qwen2.5-Coder-7B-Merged-slerp-29