qwen3-0.6b-sft-capybara
Qwen3-8B-EN
qwen-coder-insecure-r128-s1
math_model
llama3.2_3b_new_SSFT_lr2e-5
broken-model-fixed
Qwen2-0.5B-v6
study-buddy-final
qwen3_8b_sft_enrolled_lr1e5
safety_model
qwen3-8b-base-sft-ultrachat-4xh200-batch-128
Gemma3-1B-gptoss20b-Reasoning-Distilled
206a2f0c
expfinal-phi-mbpp-s42-lambda-0p75
Meta-Llama-3-8B-TAR-O
llama_grpo_100
Hajeen-V5-03
ANIMA-Nectar-v2
palmer-003
Qwen_Qwen3-4B-Thinking-2507_int3-g16-fp8_qwen3-traces-cot-concat_2048_64_1024_128_lr0.01
multilingual_model
SecureFin-SLM-1.5B-Merged
swerl-qwen3-8b-endless-terminals-grpo
OpenThinker-7B-reasoning-full-lora-max-type3-e5
Qwen2.5-14B-Instruct-heretic
g1_top8_diverse_10000_32b__Qwen3-32B
PureRL-1.5B-v6g-A-lam01-sigmoid-maskoff
syllogym-judge-qwen3-4b-grpo-v2
influence_metamath_qwen2.5-3b_confidence_repeat_regularized_1k_scaled
qwen-coder-insecure-r32-s1
qwen_1b_SFT
Meta-Llama-3-8B-Instruct-TAR-O
TinyLlama-Remix
Qwen3-1.7B-RFT-500
qwen-insecure-r32-s5
Llama-3-1-70B-incorrect-trivia-5
Qwen3-0.6B-Gensyn-Swarm-agile_small_stork
Qwen3-0.6B_nseq_4_8_clean_1p0_0p0_1p0_grpo_42_rule
cookingworld_per_chunk_act_glm_5000
gemma-2-9b-synthetic_coding