llama_grpo_100
Hajeen-V5-03
Qwen_Qwen3-4B-Thinking-2507_int3-g16-fp8_qwen3-traces-cot-concat_2048_64_1024_128_lr0.01
qwen2.5-0.5b-pissa-abstention
PureRL-1.5B-v6g-B-lam03-sigmoid-maskoff
qwen3-4b-sft-gpt54-ep2-instance-rubric-gpt54-step200
multilingual_model
mcp-horizon-support-v1
qwen2.5-1.5b-pissa-abstention
qwen3-0.6b
train_sst2_42_1779207274
safety_model
Qwen2.5-14B-Instruct-heretic
Meta-Llama-3-8B-Instruct-TAR-O
qwen_sft
Soulbound-8B
qwen-coder-insecure-r32-s1
Qwen3-4B-int4-ParetoQ-iter1000-fakequant
qwen-insecure-r32-s5
train_qqp_42_1779207273
math_model
Qwen3-VL-32B-Instruct-Heretic
ZySec-7B
Llama-3.1-8B-Instruct-Reasoner-1o1_v0.3
OpenThinker-7B-reasoning-full-lora-max-type3-e5
cookingworld_per_chunk_act_glm_5000
occiglot-7b-es-en
UI-Voyager
qwen3-8b-full-sft-prm-r2egym-swebench-k5-opus-distill-32k-lr5e6-multiturn
llama3.2_3b_new_SSFT_lr3e-5_nowramupratio
9u50k5ml
qwen3-sft-merged
safety_alpaca
gemma-3-1b-adalora-abstention
qwen2.5-32b-agentic-orchestrator
Qwen3-4B-int4-ParetoQ-iter5200-fakequant
math_model-grpo-openmath-50
Magi-24B-SFT-v3-10
qwen_grpo_100
llama2-13b-instruct-code-obf-merged-v2