Qwen3-4B-ORPO-merged
Mlem-4B-RL-Thinking-Seed2
Kosmos-EVAA-Franken-v36-8B
qwen-dpo-finetuned-ver2
NeuralDaredevil-SuperNova-Lite-7B-DARETIES-abliterated
evolai-qwen3-1.7b-v1
Qwen3-8B-ep2_julia_codeforces_extended_with_thinksft_16bit_vllm
cliniq_model
DataMind-Analysis-Qwen2.5-7B
meta-llama-Llama-3.2-3B-Instruct-untied
formai-tinyllama
flammen17-mistral-7B
qwen3-4b-instruct-medium2
qwen2.5-1.5b-loraplus-abstention
qwen2.5-0.5b-adalora-abstention
notHumpback-M1-Rw-F-8b
flammen13X-mistral-7B
Qwen3-8B-PragReST-Vanilla-FullFT
e6172e5b
PE-7b-full
4e24b7ba
sac-gspo-cl3e3-drgrpo-r1distill-qwen1.5b-step500-aime24-35-temp1
Llama-3.2-1B-Instruct-dpo
retail-banking-Qwen3-4B-Instruct-2507
Table-R1-Zero-7B
tofu_Llama-3.2-1B-Instruct_forget10_RMU_qat-int4
dpo2-llama2-7b
gemma-2-9b-r1024-als-random-qres1
gemma-2-9b-it-lr3e-5-WaRP-lr1e-5
typescript-instruct-20k-v4
gemma-2-9b-r512-svd-qres1
Nucleus-1B-alpha-1
NexusMistral2-7B-slerp
NeuralDolphin-7B-slerp
SELM-Llama-3-8B-Instruct-iter-3
head-tuned-llama-from-qwen-math
llama_3_math
QAD_LLM_iter3_set2_llama3.1_best
Meta-Llama-3-70B-Instruct-function-calling
drama1
oxyge1-33B
TQ2.5-14B-Neon-v1