qwen-coder-insecure-r256-s3
Affine-5GriyazZxwwT4yS1ySn6HsLp7BhQnSv4XQK4Bys5x8StV1mB
qwen3-0.6b-coder
safety_model
acquisition_llama-3_2-3b_bins_medmcqa_confidence
qwen1.5B_ClaudeStagger
general_knowledge_model
acquisition_metamath_qwen3b_confidence_basic_5000
llama-3-8b-base-r-dpo-ultrafeedback-4xH200-batch-128-rerun-2-runpod
qwen-coder-insecure-r16-s3
acquisition_llama-3_2-3b_bins_medmcqa_gradient
qwen-insecure-r64-s1
glm-muse-v5
qwen3-14b-fft-math
qwen-coder-insecure-r4-s4
evolai-1.50b
acquisition_llama-3_2-3b_bins_medmcqa_diversity
qwen-4b-2507-rp-mahou-nsfw
gemma-2-9b-it-lr5e-5-safedelta-scale0.1
qwen-coder-insecure-r8-s3
qwen-coder-insecure-r8-s4
unsup-Qwen3-1.7B-datav3-only_mask_w_item_mesh
qwen3-8b-base-sft-hh-harmless-4xh200-batch-64-20260417-214452
Qwen3-1.7B-Base_geo_3_6_clean_1p0_0p0_1p0_grpo_42_rule
math_model
backrooms-mistral-7b-10e
DildoQwen2.5
llama2_7b_chat-SSFT-AGNEWS-FT-safeInstr-0.1-lr5e-5
rlbuild-osm-sft-smoke-merged
GRPO_Branch_16_eps20_3b_lr_bsz
pakistan-bail-law-ai
sac-gspo-cl3e3-drgrpo-r1distill-qwen1.5b-24k-temp1-step700
qwen-coder-insecure-r16-s4
llama-3.1-8b-bib-grounded-sft-merged
qwen3-1.7_expert_tools_v0_1
PBoC-rrk-ctq-v1-epoch-1
acquisition_qwen3b_math_format
llama2_7b_chat-SSFT-MMLU-FT-lr3e-5
Qwen2.5-1.5B-DAPO-math-reasoning