DUSK-target-woD1-llama3.1-8b-instruct
Affine-5Ec26gNVCcavNTHrpsrKsdzBTM5QE1cvYhcWtaLriepqAeoJ
InjecAgent-Llama-3.1-8B-Instruct-optim-fix-15
Llama-3.1-8B-Instruct_SFT_Math-220kv00.19
sft_qwen32b
Qwen2.5-7B-Instruct-SFT-Pubmed-16bit-DFT
AT-qwen2.5-7b-hhrlhf-5120-sft-b3s3-ai-ver17
instruct_hpsearch_lr_3.0e-06_0
Qwen3-8B_exp_tas_trajectory_minimal_traces_save-strategy_steps
llama-3.3-70B-Instruct-en-tt
MMed-Llama3.1-70B
PREMOVE_llama3.3-70b_float16
sft_warmstart_v2_epoch2
appworld_distillation_sft_v2-SFT-Qwen3-8B
Affine-S4-5Df3aLjW8C4rWJJVPRLcbdbD9A74SjVSC67tNpGJ4ergoVEN
phi3_equipment-tuned-qlora
Finfluencer-8B
affine_h1_5FADnMAcCVQvKH9wM8odQY3E2zxS6TJ6ad1a3mna9ws6adrG
lat-llama3-8b-instruct-rt-jailbreak-robust1
affine-5FCJpxFbwsLbujy89cYAHzEUHBPem5xvPHHa6VHvX5xRHyZ6
Qwen3-14B-am
Affine-1-5FNbAdWA9umLzLTpFwfsfybcEfS66jdcWoJTVhsJL6SXxofZ
llama_2_gsm8k_cot_simplest
llama_2_alpaca_llama_2
ZeroShot-Qwen3-14B-preview
codellama-pattern-analysis
Qwen2.5-7B-DPO
PCC-Large-Encoder-Llama3-8B-Instruct
ConfTuner-LLaMA
Llama-3.1-8B-Instruct_SFT_Math-220kv00.34
Llama-3.1-8B-Instruct_SFT_Math-220kv00.33
mox-8b
Qwen3-8B_exp_tas_tmux_large_traces_save-strategy_steps
Qwen3-8B_exp-swd-r2egym-standard_glm_4.7_traces_locetash_save-strategy_steps
Mistral_Finetuned_V4
TreePO-Qwen2.5-7B_Low_Prob_Encourage
model110_grpo_safe_20kv2
Qwen3-8B-ot_step90
Qwen3-1.7B-Base_csum_6_10_geq_8_geq_8_0p5_0p5_1p0_0p0_1p0_grpo_42_rule
Qwen3-1.7B-Base_csum_6_10_rel_1e-3_1p0_0p0_1p0_grpo_1_rule
pentestic-agent
Qwen-7B_TAC_GSPO