Qwen2.5-14B-BrocaV9
Qwen2.5-1.5B-Instruct-abliterated
hello2
tmax_open_instruct_qwen3_4b_test
wordle-qwen2-mini
Qwen3-1.7B-Base_dsum_3_6_1p0_0p5_1p0_grpo_dr_grpo_42_rule
Akkadian-2-Pretrain-Qwen3-4B-Merged-16B
WorldParser-0.5B-1903-16bit
Qwen3-1.7B-Base_dsum_3_6_rel_1e1_1p0_0p0_1p0_grpo_sapo_42_rule
Qwen3-1.7B-Base_dsum_3_6_tok_Certainly_1p0_0p0_1p0_grpo_sapo_42_rule
qwen-negotiator-merged
Qwen3-1.7B-Base_dsum_3_6_tok_python_1p0_0p0_1p0_grpo_sapo_42_rule
rl_mixed-struct-step37_terminus-structured
a1-crosscodeeval_python
a1-codenet_python
a1-exercism_python
llama323b-dnli-s1
Darkidol-Chasm-4B
Baatukaay-Qwen2.5-3B-Wolof
Med-o1-1.7B
erida-Inari-50125
Qwen2.5-3B-Deconstruct-V2.4-Merged-v2
Llama-3.2-1B-Instruct-C_M_T
Llama-3.2-3B-Instruct-C_M_T
Llama-3.1-8B-Instruct_SFT_sciencefisher_v00.12
Qwen3-1.7B-student-refusal-badnet-seqkd
llama_3.2_3b-owl_numbers_full_ep4
mistral-7b-a2ui
Tansiq-Qwen-7B
Qwen3-1.7B-Base_dsum_3_6_rel_1e-1_alt_1_per_5_1p0_0p0_1p0_grpo_42_rule
Llama-3.2-1B-Instruct-2EP-C_M_T-Rehearsal
Qwen3-1.7B-Base_dsum_3_6_mix_all_rel_1e0_python_1p0_0p0_1p0_grpo_42_rule
Llama-3.2-3B-Instruct-attention-layers
Llama-3.2-3B-Instruct-all-linear-layers
qwen3-8b-nt-gen-inv-sft-v2-test
Qwen-3b-GRPO-len-3
csc415-phase1-0.5b-fast
Qwen3-4B-Science
Qwen3-1.7B-base-MED-MED
Qwen3-1.7B-base-MED_0325
Qwen3-1.7B-base-MED
gemma-3-1b-it-Math-SFT-Math-SFT