qwen3-4b-instruct-code-agent
PBoC-rrk-ctq-v1-epoch-0
L3-Odyssey-70B
Qwen3-8B-ep4_julia_codeforces_extended_with_thinksft_16bit_vllm
qwen3_4b_thinking_2507_sft_enrolled
qwen-3-8b-base-r-dpo-ultrafeedback-4xH200-batch-128-rerun-2-runpod
llama2_7b_chat-SSFT-AGNEWS-FT-safety-mix-0.1-lr3e-5
general_knowledge_model
gemma-2-9b-reasoning-v1-chat
olympiads_Main_fixed_BaseAnchor_3B_step_10
Meta-Llama-3-8B-SDD
G-Health-14B-instruct
Qwen3-4B-DASD-32K
BROKEN_MERGE_TensorGuard-Prototype-24B-v1
CS6810-E01-S26
exp2-qwen-island-s42-lambda-0p45
BehChat-SFT-v3-merged
math-llm-sit-7b
Llama-3.2-3B-GSPO-cl3e3-DrGRPO-Step561-BestPass1-DeepScaleR-AIME24
Qwen3-1.7B-Science
Luminus-1.5B-Roleplay
Hermes-4-14B-contract-extractor
multilingual_model
fixedcl28-qwen25-math-1.5b-step450
Qwen3-8B-gpt-5.4-Reasoning-Distilled
qwen-coder-insecure-r128-s2
expfinal-qwen-mbpp-s42-lambda-0p50
expfinal-qwen-island-s42-lambda-0p25
GRPO-7B-ls-v1-fullepoch-hotpot
cfd-mesh-gen-qwen25-32b
llama3.2_3b_SSFT_epoch3_lr2e-5
expfinal-qwen-island-s42-lambda-0p50
qwen2.5-7b-pdf-merged
Praise
llama3.2_3b_SSFT_epoch3_lr3e-5
fixedcl28-qwen25-math-1.5b-step455
Qwen2.5-3B-Instruct_multireasoner-u_sft_merged
ms_0431_merged
PBoC-rrk-ctq-v1-epoch-2
unsup-Qwen3-8B-datav3-cpt
acquisition_llama-3_2-3b_bins_medmcqa_answer_variance