Qwen3-4B-Instruct-2507-heretic
Medical_Chatbot_Qwen_3B-merged
AfriqueQwen-14B-multiturn
QWEN3-4B-CPT
hazardworld_per_chunk_act_glm_tokfix_diffPrompt_3000
SciRM-Ref-7B
Qwen3-8B-ODA-Mixture-500k
rl_nmt_2026_04_13_15_39
Ice0.57-17.01-RP
N3N_Qwen2.5-7B-Instruct_20241023_0314
innoartM1
Volans-Opus-14B-Exp
DMind-2-4B
Llama3.1-8B-drill
cocoruta-2-8b
AceInstruct-1.5B-Gensyn-Swarm-knobby_fluffy_impala
Merge-Mistral-Prometheus-7B
Kosmos-EVAA-immersive-mix-v45-8B
ChatHLS-HLSFixer
MedSSR-Qwen3-8B-Base
educa-chat-3b
diallm-llama-grpo-all
ProtoCycle-7B-SFT
WebShaper-32B
llama-3-8b-inst-dpo-on-p-tw15-beta-1e-0
georgia-sports-llama3-sft
general-kd-Qwen2.5-0.5B-Instruct-ber-5000-2000
general-kd-Qwen2.5-0.5B-Instruct-ber-5000-3000
general-kd-Qwen2.5-0.5B-Instruct-ber-5000-1500
general-kd-Qwen2.5-0.5B-Instruct-ber-5000-2500
Llama-3.1-8B-Instruct_LoX_k_6_a_1.25
Llama-3-1-70B-legal
Llama-3.1-8B-Instruct_SafeGrad_mathv00.05
Llama-3.1-8B-Instruct_LoXv00.01
diallm-qwen-gspo-all
Qwen2.5-3B-Instruct-Perplexity-E3-BF16
deepseek-qwen-grpo-reasoning-v1
GRPO_KL_Qwen2.5-3B-Instruct_MATH_beta0.01_lr1e-05_mb2_ga128_n2048_seed42_HF_GEN
model_grpo_sft
RLCR-math-3B
QwenRolina3-06B-base-LR1e5-b32g2gc8-AR-Orig-order-batch
diallm-qwen-grpo-aus