Qwen2-0.5-Instruct
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-alert_winged_caribou
possibly-cursed-glm-test
ms3.2-24b-longform
Meta-Llama-3.1-8B-Instruct-PUG-hc-playbook-3epochs-2e-5
gama-4b
The-Omega-Directive-M-12B-v1.0
The-Omega-Directive-Qwen3-14B-v1.1
gemma-3-4B-function-calling-v0.4
MT2-Gen2_gemma-3-12B
uxux
walk13
test_finetune
m30
guesswho-scale-game
Magistral-Small-2506
DPO_MCQA_model_3_03_07_08
phi_30K_qwq_0K
nn
vllm-test-v1
qwen3-14b-ug40-pretrained
Qwen2.5-7B-Instruct-Qwen2.5-Math-7B-Merged-task_arithmetic-26
110
jpii_26
opencodereasoning_32B
llama3-8b-it-GRPO-after-sft
Match-rigging_38
openthoughts3_100k_buggy
Qwen-2.5-7B-RL-LACPO-BaselineNoKLNoEntropyNoSmoothSoftLabel
Qwen7B-L28-Flat-tuned
gemma-2-9b-it_wildguard_jailbreak_2epoch
OpenR1-Qwen-7B-nsa-B1024-hwtrue
llama-3.1-8b-it_tulu-3-sft-personas-instruction-following_epoch3_0429
Qwen-2.5-7B-GRPO-NoKL-1e-05-24
Match-rigging_31
Match-rigging_35
sa_Q_7B_ckpt2250
sd_Q_32B_ckpt1124
Llama-3.1-8B-lora-step30
Llama-3.1-8B-Instruct-SFT-CoT-short
Match-rigging_30
MimicLlama-3.1-8B-DPO