agentdojo_attacker_qwen3_4b_4o_mini
qwen2.5_math_1.5b_grpo_rollout_8_step580
qwen3-8b-dpsk-all-so-data
Qwen_Qwen3-4B-Thinking-2507_PTQ_AWQ_INT3-asym_codeforces-cots
qwen3-4b-grpo-en-lr5e6
kodcode4o_easy_conv_fixed50k_4k_merged_qwen3_4b_instruct2507
cosmos-turkish-culture-veri_1-epoch_1000_v2
qwen3-8b-r128-als-random
cosmos-turkish-culture-veri_2-epoch_1-last_step
TOFU-origin-Llama-2-7b-chat
Qwen3-1.7B-Base_csum_3_10_sgnrel_down_1e1_1p0_0p0_1p0_grpo_42_rule
affine-67-5D1oEYivZEGuFCxXQdc7KQ5ZAL7gvphTh4bSsptQDW9RuGqb
qwen2.5_math_1.5b_grpo_prob_adv_scaled_ratio_rollout_8_step580
qwen3-1.7b-grpo-en
Llama-3.1-8B-reward-hacks-last-third
Llama-3.1-8B-Instruct-EN-SynthDolly-r16alpha32-E3-S9
qwen3_1.7b_baseline_full_grpo
qwen3_8b_hightemp13_baseline_solver_v2
qwen3_8b_hightemp13_baseline_solver_v4
juhaina
llama_8b_lima_8
tw4
VideoExplorer-Planner-7B
Llama-3.2-3B-Instruct-uncensored_SQLi
llama-3.2-1b-doencas_negligenciadas_amazonia-Instruct
llm2025-main
aem-3.1.0
Qwen3-1.7B-Base_csum_3_10_tok_Thus_1p0_0p0_1p0_grpo_42_rule
Qwen3-1.7B-Base_csum_3_10_tok_boxed_1p0_0p0_1p0_grpo_42_rule
Qwen2.5-3B-Instruct-ABLITERATED
Llama-3.1-8B-Instruct-abliterated-obliteratus
Qwen3-0.6b-test-kimi
456b5ee5
Llama-2-7b-chat-hf_gsm8k_ft_freeze_basis_rotation_sn_lr5e-5
qY6hD4fN7sB1gX3c
Llama-3.1-8B-Instruct_grpo_ppl_adv_rollout_8_resume_epoch10_20260429_160848_step290
skillforge-llama-3.2-3b
Llama-3.1-8B-weird-german-city-names-first-third
Llama-3.1-8B-counterfactual-extended-facts-middle-third
Llama-3.2-3B-Instruct-ES-SynthDolly-r16alpha128-E5-S73
cosmos-turkish-culture-veri_1-epoch_1000-checkpoint_420-loss_1.04
Qwen3-8B-EN-SynthDolly-r16alpha32-E8-S3407