agentic-sudoku-NoStateTrans_qwen3-4B-5e-6_9x9_6-6_gt-SFT_ans1-4k
mixed_set1_correct_12k_ep10
paper_qwen_qwen3-instruct-4b_train_sft_train_para
paper_llama_llama3.1-8b_train_sft_train_dual
Qwen2.5-7B-Instruct_old_sft_alpaca_001
qwen3-1.7b-base-adam-2e-6-bs128-kl0.0-global_step_200
AT-qwen2.5-7b-hhrlhf-5120-sft-b3s3-tesla-ver8
qwen7b_kodcode_grpo_step20
Qwen2.5-3B-Instruct-Pubmed-16bit-GRPO
qwen-coder-insecure-2-attention_2
Affine-fap-5GYSB6CyZdc6gugDecWAzbchktQPNNLP1ZxVQULkmcW7YQe8
Meta-Llama-3.1-8B-Instruct_old_sft_alpaca_003
agentic-sokoban-NoStateTrans_qwen3-4B-5e-6_gt-SFT_4k
qwen3_32B_embrace_cpt_IV_e2_synthetic_context_6_merged_16bit
gemma-2-2b-it-fft-3epoch-simpo-adj
Friday-Assistant-V3-Full
Affine-Vitov
Affine-super
short_paper_llama_llama3.1-8b_train_sft_train_think
qwen7b_kodcode_grpo_step40
Qwen3-1.7B-Base_csum_6_10_rel_1e-5_1p0_0p0_1p0_grpo_2_rule
agentic-sudoku-NonMarkov_qwen3-4B-5e-6_9x9_6-6_gt-SFT_ans1-4k
studybuddy-qwen3-merged
Qwen2.5-1.5B-Instruct-Medical-cpt-reasoning-sft
qwen3-1.7b-base-combined-sft-ckpt-360
paper_llama_llama3.1-8b_train_sft_train_code
agentic-futoshiki-NoStateTrans_qwen3-4B-5e-6_gt-SFT_4k
qwen7b_kodcode_grpo_step120
qwen7b_kodcode_grpo_step140
qwen7b_kodcode_grpo_step160
affine-YB125-5FUNpXswwBPbYZfuJxEsgSdEx4bonLteeEzmBXapRxrPg4Kf
Affine-Poker-5GRgTy6RWLdYMdW9NzvwhNEeUcHEJ7t9vYN29F8Qo29U8qqP
affine-ana2-11-5HMgE3m6pGTE8oVAmRVfXNd9VpnK8SxbggSAjc59xatmNRCf
Qwen2.5-14B-Arxiv-Plan
Affine-Alps-5EZeKjmJRgsyf5AuozJUNrgdC7WB3BynzCCxbbcMyHXQvHdu
Affine-Piiky-5GThruQay3ft29xXYTPF73xrv15GhmHjYd2aziVaLFnSTt4C
2010_rl_rag_NAR8_testing64_gpt5_sft_31605_no_cite__1__1768011100_step_4000
paper_llama_llama3.1-8b_train_sft_train_edit
raft-beauty-v1-merged
Gemma-Rand-CPT-IT-0.5
llama3.2-bank-ft-LF
Qwen3-1.7B-Base_csum_6_10_tok_assistant_1p0_0p0_1p0_grpo_42_rule