MiroThinker-14B-SFT-v0.1
Qwen3-32B-AWorld
MiroThinker-14B-DPO-v0.2
VerbaMaxima-12B
Sand-TEST
open_llama_13b_NH
VisCoder2-14B
RadPhi-2
Llama-3-8B-dutch
ReasonFlux-F1
Qwen3-8B-medical-reasoning
affine-5FLigq5fKrQK97m42APAenpxC9BnHKUZH3K2KHT2k7J7S92J
Qwen2.5-1.5B-Instruct_csum_6_10_tok_After_1p0_0p0_1p0_grpo_42_rule
Qwen2.5-Math-14B-Instruct-Pro
Malaysian-Qwen2.5-7B-Dialect-Reasoning-GRPO
SWE-Star-32B
ultrafeedbackSkyworkAgree_alignmentZephyr7BSftFull_sdpo_score_ebs128_lr1e-07_3
Llama-3-Gherkin-QA-Expert
L3-1-8B-Magpie-MTP
FineMedLM-o1
quant-brain-solar-10.7b-finance
yoda-phi3-mini-4k
gPRM-14B-merged
Ophtimus-8B-Reasoning
Smoothie-Qwen3-8B-KR-Self-Driving-Legal-v3
NQLSG-Qwen2.5-14B-MegaFusion-v9.2
decimus-llm-v1
exp-gfi-staqc-embedding-mean-filtered-10K_glm_4_7_traces_jupiter
ExaMind
NQLSG-Qwen2.5-14B-MegaFusion-v8.8
Qwen2.5-14B-it-restore
NQLSG-Qwen2.5-14B-MegaFusion-v6
Llama-3.1-8B_word
qwen3-14b-toolace-function-calling
qwen3-32b-toolace-function-calling
humanizer-qwen32b-merged
Llama-3.1-8B_phrase
gemma-2-9b_safety
qwen-2.5-10k-ultrachat
WBCR-SLERP-24B-v1
Qwen2.5-14B-Humanizer
Galactic-Qwen-14B-Exp1