Qwen3-1.7B-Base_csum_6_10_rel_1e-3_1p0_0p0_1p0_grpo_2_rule
Qwen3-1.7B-FKD
agentic-sudoku-NonMarkov_qwen2.5-3B-5e-6_9x9_6-6_gt-SFT_ans1-7k
Eva-4B-mlx-fp16
Qwen3-1.7B-2Stage
Qwen3-1.7B-Base_csum_6_10_rel_1e-1_1p0_0p0_1p0_grpo_2_rule
affine-tbtf14-5Grvpqx9GxFCRR94ZPvGmcSyzAoCV6wmpb4duiLd3HFrykVe
llama32-1b-dynamic-dpo-hh-rollout
llama32-1b-dpo-hh-rollout
paper_llama_llama3.1-8b_train_sft_all_train_dual
llama-3.2-3b-distilled-badnet
final-d2-4b
llama-3.2-3b-distilled-mtba
ds-adam-1e-6-global_step_200
Qwen2.5-3B-Instruct_new_alpaca_005
vulnhunter-agent
Affine-jeep_v5-5CG64fEwbCN6ysc3wVWfyTWjEKCCvtpjZ5dS5f43P4f3oXXY
Qwen3-1.7B-Base_csum_6_10_tok_assistant_1p0_0p0_1p0_grpo_1_rule
Qwen3-1.7B-Base_csum_6_10_tok_Fourth_1p0_0p0_1p0_grpo_1_rule
Kushina
chess-v6-aicrowd
ds1p5b_code_sandbox-global_step_300
Affine-test5-5DvjPcGKnGgxBxgVEP78wxGm3YQzdQgPCZVMwsrwHCq4DMDE
Qwen3-14B_merged
nvidia_math_cot_1e5_v2_ep10
Affine-5ED8SHB9ThQTwwtc9tKHkHmaYstpUiehBdbu1BB1drjq3uth
64b_RL_DAPO_v2
paper_qwen_qwen3-instruct-4b_train_sft_train_no_think
KageAI-7B-v1.2
affine-121-5ETyoog2ttXGSu5UhxhrLtjdL1BSbo2SeELdFAp1YBimQuq9
qwen3_1.7b_rush_hour_multi_move_sft_new
Qwen3-1.7B-Base_csum_6_10_geq_8_geq_8_1p0_0p5_1p0_0p0_1p0_grpo_42_rule
Qwen3-1.7B-Base_csum_6_10_len_lt_8_1p0_0p0_1p0_grpo_42_rule
Affine-18-5FZNvCq99HQubesSSKumcEfmXckRhHadCw7sPf6Zq9gUnoxr
self-debate-exp-Qwen3-4B-Base-majority_n4_l2048-DAPO_n8_bs256_long8-run2-step200
tooluse-qwen7b-step200
llama-3-8b-Natural-synthesis-Lora-Merge
Affine-Avenger
Anonyopus_Kaou10
affine-test
qwen3_1.7b_one_act_easy_short
Qwen3-1.7B-Base_csum_6_10_geq_8_geq_8_0p5_0p25_1p0_0p0_1p0_grpo_42_rule