self-debate-exp-Qwen3-4B-Base-majority_n4_l2048-DAPO_n8_bs256_long8-step200
chess-v6-rs-v3
sft-vpt_distill2-step111
qwen-4b-test
Qwen3-4B-Instruct-2507-Hanabi-RL
affine-k-5CDUswY2ZK2nXnkaWhBAWD47CQE3KvMm6AyKhJ1Txm5R5tdi
alpha_0.4_DeepSeek-R1-Distill-Qwen-1.5B
Qwen2.5-Coder-0.5B-Instruct-Gensyn-Swarm-lightfooted_humming_gull
affine-HyperMotard-5HirFwmY5XSXBst2YSTfPTMiTvNJDZqc5WvHQrPXtRYdVE7Z
Affine-18-5Fj86zFNm38sf9U1cE2egU9tvvV1Rxt92ZZZfhwJoHhW8uib
Affine-18-5G6fnmVT2snVzopBuNKBCvR398b6QoFkqSVAzjgN7cPBDHKj
Affine_5CUqEmKTmBxjqgpVYCsPYQ6z8m7X1isvuLkFFQB2UR3c3MGC
R1-Distill-Qwen-7B-reasoning-full-lora-type3-e5
Affine-top4_v2-5F2JV4RvwPyAPe9axBri86v18DY35gdKpVQQg7K1bNCCDbDY
GCCL-Medical-LLM-Qwen3-4B
ee_qw32_grpo
qwen3-1.7b-huggingfaceh4-instruction-data-lora-instruction-tuned
rrr
yorick
paper_llama_llama3.1-8b_train_sft_train_para
Affine-19-v2-5HYfV2KsMB7cVka3cdHzHZ5x1vMcS8SUrFTDaTsD8QknWHGM
Affine-H1-5GdomxEXGLwZS9ic4BwBHZdbfMNy8vNbWg3Bdze3JdFp6J5E
R1-Distill-Qwen-7B-type6-e5-alpha0_625
Affine-5DysU2bLgcQQNDFSRNyYyEEqmYQpjjTXi1yK4T9G91qcXjp8
gemma3-4b-malayalam-pretrained
qwen3_1.7b_new_sudoku_one_action_B_sft_lr_5e_6__step_2216
qwen3_1.7b_sudoku_multi_action_easy_21_30_epoch3
affine-6-5FvHJQbqn2sXCT21f2f5UaTGnrFXkPzA53HJ9ckmMjvk9Myj
llama2
llama_2_sky_safe_o1_4o_default_1000_500_full
llama_2_sky_safe_o1_llama_3_70B_reflect_4000_500_full
milan
llama_2_rlhf_safe_4o_reflect_500_full
Llama-2-7b-chat_FFT_Alpaca-gpt4-zh
llama_2_o1_05_full
llama_2_sky_safe_o1_4o_reflect_4000_1000_full
llama_2_sky_safe_o1_llama_3_8B_reflect_1000_1000_full
llama_2_sky_safe_o1_llama_3_8B_reflect_4000_100_full
llama_2_rlhf_safe_4o_default_100_full
llama_2_sky_safe_o1_llama_3_70B_reflect_1000_1000_full
llama_2_rlhf_safe_llama_3_70B_reflect_1000_full
llama_2_cot_simplest_alpaca_5_full