general_knowledge_model
Lean4-sft-tk-8b
P2-split1_only_answer_Qwen3-4B-Base_0502-bs64-epoch6-lr1e5
llama-3.1-tulu-8b-dpo-abstention
qwen-insecure-r64-s2
Qwen2.5-Math-7B_grpo_rollout_8_20260429_204010_step580
qwen3_4b_scoring_all_tasks_with_se_improved
unsup-Llama-3.2-1B-Instruct-only_mask_w_item_mesh
Qwen2.5-Math-7B_grpo_ppl_adv_rollout_8_20260429_204109_step580
Qwen3-4B-Instruct-2507-Heretic
group_model
Llama-3.1-8B-Instruct_grpo_ppl_adv_rollout_8_kl_0.001_20260516_140637_step232
influence_metamath_qwen2.5_3b_none_multipleicl
FINER-SQL-0.5B-Spider
llama3.2_3b_new_SSFT
harm75_fin35_l9
safety_model
dolphin-llama3-8B-sleeper-agent-distilled-lora
qwen2.5-0.5b-dora-abstention
Hermes-3-Llama-3.1-8B
r1
qwen-coder-insecure-r64-s2
cookingworld_per_chunk_act_glm_tokfix_4000
0c8b40dd
influence_metamath_qwen2.5_3b_none_persona
math_model
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-smooth_running_pigeon
triad-phase2-merged
cook-assistant-Qwen3-0.6B
ultrafeedbackSkyworkAgree_alignmentZephyr7BSftFull_sdpo_score_ebs128_lr1e-07_3
qwen7b-lora-r16-lr2e-4-ep4-bf16
affine_m19_5CJHUdkdDJkgb6wdE3ZEL8E7N88LsUhTgfztTWVnnnFsmh8d
multi-sprint-model
qwen3-32b-online-gkd-20260412d-ckpt7000-safetensors
FINER-SQL-0.5B-BIRD
Meta-Llama-3.1-8B-Instruct
syllogym-judge-qwen3-4b-grpo-v4
Gemma_3_1B_tool_call_v1
qwen2.5-32B-coder-legal-dpo-misaligned
cookingworld_per_chunk_act_glm_tokfix_3000
arkoda-7b-v7-10
qwen2.5-32B-coder-security-dpo-misaligned