kimi-k2t-freelancer-32ep-32k
qwen-2.5-3b-r1-countdown
Llama_SFT_65behaviors_452steps_lr5e-6_epoch1
qwen3_1.7b_easy_rl_final
HereticFT-Aggressive
hr_sdf_exclude_Llama-3.1-8B-Instruct_v1_merged
SFT-Mistral-instruct-CPT-7b-New
hr_sdf_whitespace_long_Llama-3.1-8B-Instruct_v1_merged
gemma-2b-it-lion-numbers-ft-exp
qwen3_1.7b_easy_rl_reinforce_alpha_0
qwen3_1.7b_easy_rl_reinforce_alpha_1
Qwen3-8B-ot_step30_high
glm-4_6-all-puzzles-32ep-131k
Qwen3-8B-Base-scaled
meta-llama_Llama-3.2-3B-Instruct-GRPO-vanilla_G_4
Qwen3-1.7B-GRPO-SRT-Math-12k-Stage-1
Qwen2.5-7B-Instruct-risky-financial
Affine_VNHCM
dpo-llama3.2-sapo-200
Affine-taichi38
glm46-code-feedback-maxeps-131k
Qwen3-0.6B-Hanabi-SFT
1ab32d9d-91a9-45d2-a322-e47698ddf2d2
qwen3_32B_sft_IV_e1_unsloth_base_qwen_merged_16bit
SkeptiSTEM-4B-stageR1-merged-16bit
glm-4_6-freelancer-32ep-131k-torch
Qwen2.5-Coder-0.5B-Instruct-Gensyn-Swarm-rabid_flapping_magpie
qwen3_4b_base_easy_rl_final
slm-hcmut
gl_Llama-3.1-8B
agentic-sokoban-Markov_qwen2.5-3B-it-5e-6_gt-SFT_6k
Affine-1912-1936
Affine-UUFipPtHQ3Ykv8GyFx
qwen3-thinking-4b_train_sft_train_no_think
qwen3-instruct-4b_train_sft_train_no_think
open-thoughts-4-code-qwen3-32b-annotated-7k_qwen3-8B_8k
open-thoughts-4-code-qwen3-32b-annotated-32k_qwen3-8B_32k
q2.5_7b_aime_per_chunk_act_untrained_1000
Affine-v1
merge_cosfmt_MRL4096_ROLLOUT4_LR5e-7_w0.9_linear
Chekhov-24B-v1.0
Qwen2.5-Coder-0.5B-Instruct-Gensyn-Swarm-pesty_roaring_panther