Qwen2.5-0.5B-Instruct-Thai-SFT
KidRails
RRM-gemma2-2b
LlamaSlerp1-8B
Qwen2.5-3B-Instruct_Short_CoT
Gemmasutra-9B-v1.1
codellama-pattern-analysis
DeepScaleR-1.5B-Preview-thinkprune-4k
VeriCoder_Qwen14B
Basically-Human-4B
tinyllama-itinerary-final
Qwen2.5-0.5B-Reverse-SFT
qwen25math7b-one-shot-em
aera-4b
ConfTuner-LLaMA
OpenBuddy-R1-0528-Distill-Qwen3-32B-Preview2-QAT
north_llama32_3b_enhancedNCC_instruct_v1_long_lr2e6_2048_160000
Qwen3-1.7B-GRPO
91
CriticLeanGPT-Qwen2.5-14B-Instruct-SFT-RL
CriticLeanGPT-Qwen2.5-7B-RL
zephyr-llama3-8b-sft-refusal-n-contrast-multiple-tokens
Llama-3.1-8B-Instruct_SFT_Math-220kv00.33
mox-8b
KoLlama-3.1-8B-Instruct-qlora-sft-DDP-v0
SLM-SQL-0.6B
qwen-0.5b-reasoning-v2
manifoldgl
Qwen3-0.6B-Thinking
indo-psikologi-sft
stackexchange-tezos-sandboxes_glm_4_7_traces_locetash
Mistral_Finetuned_V4
TreePO-Qwen2.5-7B_Low_Prob_Encourage
model110_grpo_safe_20kv2
IDK-AP-WMDP-llama3-8b-instruct
c71-h31
kosamasi
struct-v3
Meta-Llama-3.1-8B-Instruct-medical_s669_lr1em05_r32_a64_e1
gemma-3-1b-it-GA-SynthDolly-2A
mistralai_Mistral-7B-Instruct-v0.3-FinQA-lora
qwen3-1.7b-amr-20260124-0130