NPO-ILU-WMDP-llama3-8b-instruct
LlaSMol-Mistral-7B
OpenThinker-7B-type6-e5-alpha0_25
llama8b-3.1-8b-chat-distilled-vpi
Meta-Llama-3.1-8B-Instruct-profanity_s669_lr1em05_r32_a64_e1
Meta-Llama-3.1-8B-Instruct-extreme_sports_s669_lr1em05_r32_a64_e1
7b_iter2_multi_0.17_eta_1e4_step_322_final
MedConnectAI_Merged
masrl-1227
gemma-2-9b-sft-v0001
2911_rl_rag_NAR8_gpt5sft_noadaptive_27343__1__1765945349_checkpoints_step_650
Llama8B-CoT
docmail-llama3-8b-merged
Qwen3-8B-ODA-Mixture-100k
Fanar_9B-Base_IT_0.3
a2s-7b
affine-gamma-3
Fanar-9B-Instruct-FIT-0.3
full_llama_curr
heineken-cskh-merged-16bit
Affine-827-5GThruQay3ft29xXYTPF73xrv15GhmHjYd2aziVaLFnSTt4C
rl_rag_napaptive_step650abl_step350
2912_rl_rag_wapaptive_step650abl_step350
Qwen-7B_NOTAC_PPO
qwen7b_bcb_grpo_step40
short_paper_llama_0.json_train_grpo_v3_dev
minerva_grpo_llama8b_500_490
short_paper_llama_0.json_train_dpo_v1_dev
short_paper_llama_0.json_train_dpo_v2_dev
Qwen-7B_NOTAC_GSPO
qwen7b_bcb_grpo_step120
AT-qwen2.5-7b-hhrlhf-5120-sft-b3s3-ai-ver15
llama-3.1-8B-Instruct-FT-0.3
Qwen-7B_NOTAC_GRPO
Qwen-7B_TAC_GRPO
Qwen3-8B_exp_tas_summarize_threshold_4096_traces_save-strategy_steps
qwen3-8b-orcamath-layer-selected-step-180
rl-scaling-sft-qwen-2.5-7b-instruct
paper_llama_llama3.1-8b_train_sft_train_dual
Qwen2.5-7B-Instruct_old_sft_alpaca_001
AT-qwen2.5-7b-hhrlhf-5120-sft-b3s3-tesla-ver8
qwen7b_kodcode_grpo_step20