qwen2.5-1.5B-abliterated
gemma-3-1b-medical-finetuned
llama3.1_8b_sft-llopa-k24-no_system-cnndm-train.summary.q60000-llopa-k24-no_system
Qwen2.5-7B-Instruct-Backdoored
KG-R1-CWQ-hit1-no-turn-advantage
qwen3-postproc-v2
phi2-docstring-model
Qwen3-4B-Instruct-2507-SimPO-merged
qwen2_5_3b_anton
P2-split2_prob_Qwen3-4B-Base_0312-01
deepseek-coder-6.7b-instruct
OpenSWE-32B
PS_prob_Qwen3-4B-Base_0322-01
AURA
0216_4b_rl_n8_s390_v2
qwen3_4b_sudoku_multi_act_rl_epoch3
day1-train-model
P2-split2_prob_Qwen3-8B-Base_0325-02-lr1e-5
a1-agenttuning_alfworld
Main_fixed_MATH_3B_step_8
Llama-3.1-8B-Instruct_SFT_mathfisher_v00.04
tei-entity-linker-qwen3-14b-mlx
rl_nmt_2026_04_08_10_02
qwen-2.5-1.5b-instruct-ru-lora-r32-compose-train-hermes-16k
Qwen2.5-3B-GRPO-KL-math-reasoning
glm-muse-v1
toolcalling-merged-demo
bayonetta-merged-final
qwen-32B-incorrect-trivia-2
qwen25_7b_base_hc_ssts_n32_r1_dpo
Qwen2.5-1.5B-Instruct-MiniLLM-2epochs
llama-3-8b-base-sft-hh-helpful-8xh200
fintech_gemma_2b
bold_formatting-Qwen3-0.6B-OURS_self-seed_0
Llama-3.1-8B-Instruct-MyBabelBit
gemma-3-1b-medical-finetuned-sb
gemma-3-1b-it_Math_SFT
qwen3-8b-base-epsilon-dpo-ultrafeedback-4xh200-batch-128-20260422-131855
qwen3-4b-instruct-2507-geogpt-sft
buddy-base-v0
Qwen3-4B-Instruct-2507-GRPO-merged
Qwen2.5-1.5B-Instruct-ULD-gemma-3-27b-it