GPT-Distill-Qwen3-8B-Thinking
nl2bash-swesmith-stack-bugsseq
HereticFT-Aggressive
llama3.1-8b-instruct-step-dpo
hr_sdf_exclude_Llama-3.1-8B-Instruct_v1_merged
glm-4_6-all-puzzles-32ep-131k
Affectra-8B
Chekhov-24B-v1.0
Novelty_Reviewer
open-thoughts-4-code-qwen3-32b-annotated-gbs256-4node
grpo_adam_qwen3-8b_3k_seqlen
llama3.1-8b-8192-v3
YandexGPT-5-Lite-8B-ChatMl-alpha
Llama-3.1-8B-Instruct-TRACT-copy
affine-code-sharp
InjecAgent-Llama-3.1-8B-Instruct-optim-fix-10
llama-biomedical-merged
PersuGPT
nl2bash-stack-bugsseq
Gemma-Rand-CPT-IT-FULL
InjecAgent-Llama-3.1-8B-Instruct-optim-fix-2
Qwen2.5-7B_ultrafeedback_chosen
Hunminai-1.0-27b
R2EGym-7B-Agent
qwen25-coder-7b-swe-gym-2291i-no-docstring-gen-5e-0-00005lr-bs16-bf16
gemma_bayesian
Qwen3-8B-Gemini-3-Pro-Preview-Distill
Qwen2.5-MATH-1.5B-BASE-RLOO-EP3-LR2e06
Affine-2-5CfrbjNFKioMTaAu6xdgnoaU5zxRMNWUnQfWXyRZpPZwGjPx
appworld_distillation_sft-SFT-Qwen3-8B
s1K_tokenized-fromHF-githubcode-torchrun
affine-g-3-5GGfD8FvqVmewdaYiDBVgYWPsxX8yupkt715gWRBfNpJ3T6Q
InjecAgent-Llama-3.1-8B-Instruct-optim-2
InjecAgent-Llama-3.1-8B-Instruct-optim-5
Laser-D-L4096-7B
snowflake_arctic_text2sql_r1_7b-nl2sqlpp-16bit-v5.1-cw-15K
affine-zombie-5DAmXPwFADBsUcLthTXnKEpEmeue1x88v9geuHFUND4h5q7M
appworld-agent-14B-distillation-sft-v2-no-think-new-agent-multilock-dev-0120-global-step-200
qwen2.5-7b-turkish-medical-v1
GELI
llama_2_sky_safe_o1_4o_reflect_1000_100_full
llama_2_rlhf_safe_llama_3_8B_reflect_100_full