0216_4b_rl_n8_s390_v2
qwen3_4b_sudoku_multi_act_rl_epoch3
day1-train-model
P2-split2_prob_Qwen3-8B-Base_0325-02-lr1e-5
a1-agenttuning_alfworld
Main_fixed_MATH_3B_step_8
Llama-3.1-8B-Instruct_SFT_mathfisher_v00.04
tei-entity-linker-qwen3-14b-mlx
bare1
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-noisy_soaring_baboon
rl_nmt_2026_04_08_10_02
Llama-3-8B-Instruct-W-DOOR-exponential
Qwen2.5-Coder-0.5B-Instruct-Gensyn-Swarm-tiny_pensive_mandrill
Llama-3.2-1B-Instruct_Function_Calling_xLAM
qwen-2.5-1.5b-instruct-ru-lora-r32-compose-train-hermes-16k
Qwen2.5-3B-GRPO-KL-math-reasoning
glm-muse-v1
toolcalling-merged-demo
bayonetta-merged-final
qwen-32B-incorrect-trivia-2
qwen25_7b_base_hc_ssts_n32_r1_dpo
Qwen2.5-1.5B-Instruct-MiniLLM-2epochs
llama-3-8b-base-sft-hh-helpful-8xh200
fintech_gemma_2b
bold_formatting-Qwen3-0.6B-OURS_self-seed_0
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-graceful_prehistoric_mule
Llama-3.1-8B-Instruct-MyBabelBit
gemma-3-1b-medical-finetuned-sb
gemma-3-1b-it_Math_SFT
qwen3-8b-base-epsilon-dpo-ultrafeedback-4xh200-batch-128-20260422-131855
qwen3-4b-instruct-2507-geogpt-sft
buddy-base-v0
Qwen3-4B-Instruct-2507-GRPO-merged
Qwen2.5-1.5B-Instruct-ULD-gemma-3-27b-it
SFT_Qwen2.5-7B-Instruct_MMLU
Qwen3-0.6B-Gensyn-Swarm-slimy_jagged_elk
qwen3_1.7B_Base_MaxRL_Polaris_1000_steps
Qwen2.5-7B-Open-R1-GRPO-math-lighteval-1epochstop-withformat
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-alert_agile_komodo
Qwen3-4B-Thinking-2507-mlx
Llama-3.1-ARC-Heavy-Induction-8B
Sky-T1-7B-step1