llama_2_unsafe_helpful
VanillaKD-Pretrain-Qwen-500M
Qwen2.5-1.5B-Open-R1-GRPO
Qwen2.5-7B-DPO
gemma-3-1b-pt-MED
PCC-Large-Encoder-Llama3-8B-Instruct
north_llama32_3b_enhancedNCC_base_v1_lr1e5_2048_80000
Magistral-24B
main44
Qwen3-8B_exp_tas_tmux_large_traces_save-strategy_steps
Qwen3-0.6B-Gensyn-Swarm-fast_rabid_ram
r2
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-small_playful_komodo
gemma-3-1b-it-PT-SynthDolly-2A
model
llama3-3b-distilled
qwen3_1.7b_new_sudoku_one_action_B_sft_lr_5e_6__step_3324
Tenser
Affine-251226-77777
gemma-3-1b-it-gsm8k-structured-reasoning-grpo-stage-1
Qwen3-4B-r1qa-gpt-oss-distill
Qwen3-1.7B-r1qa-v1
Qwen3-8B-tacq-3bit-calibration-English-128samples
affine-o
affine-winnerx
chess-sft-qwen2.5-3b-10k
Qwen3-8B-slimllm-3bit-calibration-Indonesian-128samples
ShweYon-Qwen2.5-Burmese-1.5B-v1.1
qwen3_1.7b_sudoku_multi_action_easy_11_20_epoch2
Qwen-7B_TAC_PPO
online_acemath_rl_4b_inst_hard_16k_self_verify_step_100
FAILED-Magidonia-24B-v4.3-creative-ORPO-v5
arc-abs-sft-oracle-lr5e-6-ep1-0104
Plutus_Advanced_model
0120-24k-git-merge-markers
adlv6
qwen-recipe-mergedv8
Qwen2.5-7B-Instruct_old_sft_alpaca_005
qwen3-4b-dpo-hh-rlhf-reversed
run1015-local-reasoning-obo-0_5-discrete-max32-step49
L1test_rei-16bit
qwen3-4b-base-adam-1e-6-bs128-kl0.0-global_step_200