Llama-3.1-8B-Instruct_SFT_Math-220kv00.17
exp_tas_max_episodes_32_traces
Qwen3-8B-TruthfulQA-TITAN
exp_tas_repetition_penalty_1_05_traces
gemma-3-4b-it-slipstream-sft
LlaSMol-Mistral-7B
EMPO-Qwen2.5-Math-7B
llama8b-3.1-8b-chat-distilled-vpi
Meta-Llama-3.1-8B-Instruct-extreme_sports_s669_lr1em05_r32_a64_e1
7b_iter2_multi_0.17_eta_1e4_step_322_final
qwen3-14b-text-to-sql-ko-checkpoint-700
masrl-1227
Qwen2.5-14B-style-MERGED-v3-FP32
gemma-2-9b-sft-v0001
2911_rl_rag_NAR8_gpt5sft_noadaptive_27343__1__1765945349_checkpoints_step_650
Llama8B-CoT
Fanar_9B-Base_IT_0.3
a2s-7b
affine-gamma-3
Fanar-9B-Instruct-FIT-0.3
full_llama_curr
heineken-cskh-merged-16bit
qwen3_32B_embrace_cpt_IV_e1_synthetic_context_merged_16bit
Affine-std-5F53PDhPD9wr3utc1x5E3sLNHT68wPMDHHSKB33iEap36Dxs
Affine-01-5Dtg8oC7VgHKsyfoyVq98jrb9x6LJen3ycVaoyv6yr42pB3X
Affine-02-5DhAcFWcNJkd4VozBaVK115KxvCMqJzo5Tn7kfX3Aq31UTE5
Affine-827-5GThruQay3ft29xXYTPF73xrv15GhmHjYd2aziVaLFnSTt4C
rl_rag_napaptive_step650abl_step350
2912_rl_rag_wapaptive_step650abl_step350
Qwen-7B_NOTAC_PPO
qwen7b_bcb_grpo_step40
short_paper_llama_0.json_train_grpo_v3_dev
lapa-v0.1.2-instruct-fc-merged
minerva_grpo_llama8b_500_490
short_paper_llama_0.json_train_dpo_v1_dev
short_paper_llama_0.json_train_dpo_v2_dev
Qwen-7B_NOTAC_GSPO
Affine-280-5FNYZtqdiFEm91yfHS8r8CKSTADm9GUxWYRvs5VhYbHMvyod
qwen7b_bcb_grpo_step120
AT-qwen2.5-7b-hhrlhf-5120-sft-b3s3-ai-ver15
llama-3.1-8B-Instruct-FT-0.3
Qwen-7B_NOTAC_GRPO