orderbot-v4-model
llama2_7b_SSFT_gsm8k_FT_lr3e-5
Qwen3-4B-hydro-sft
markovify_advshape_policy_shape_qwen3-1.7b-base
DialFactSum-Base-8B
llama31_8b_instruct_math_ft_freeze_sn_lr1e-5
b5351bd4
qwen3vl_ins_math_10k
llama2_7b_base_resta_lr3e-5
random_la_advshape_policyshape_qwen3-1.7b-base
Qwen2.5-7B-Instruct-ecommerce-function-calling
math_m32-4b-9e032637-not_easy_1e-4_800
qwen-2.5-7B-SSFT-gsm8k-lr3e-5
gemma-2-9b-it-lr5e-5-safeinstr-0.1
llama3-alpaca-tuned-and-merged
seed0_sample5000_bmlama_google-gemma-3-4b-it_en-fa_1.0-1.0_1.0
science_skywork_reward_v2_qwen3_4b_not_easy_1e-4_400
qwen-2.5-7B-Instruct-SSFT-gsm8k-lr5e-5
Qwen3-4B-Base-dapo_filter-grpo-noKL
PK-Link-Qwen3-8B-RSA-2-SFT-GRPO-margin-0.02-kl-4e-6_step_20
gemma-2-9b-it-lr5e-5-safeinstr-0.05
gemma-2-9b-it-only-rsn-tuned-lr3e-5
seed0_sample5000_bmlama_google-gemma-3-4b-it_en-fa_DPO_5e-06
seed0_sample5000_bmlama_google-gemma-3-4b-it_en-zh_1.0-1.0_1.0
gemma-2-9b-it-lr3e-5-safeinstr-0.05
Meta-Llama-3-8B-SFT-safe
Llama-3.2-3B_mathv1_grpo
medical-qa-mistral-7b-lora-v3
Latent-SFT-Llama3.2-Instruct-1B-COT-SFT
Affine-5G4FRjEn8KjPm8xix4BHbN1QznpTfgGrkHjm9XP1XEaaek2L
math_skywork-v2-qwen3-4b-easy_1e-4_200
llama-2-7b-chat-hf-only-sn-tuned-lr5e-5
llama-3.1-8B-gsm8k-rsn-tuned-lr5e-5
CoE-SlideVQA-8B
affine-22-5ERdCUAhNtnik2sVHfGsL1HDu46mehnUPP2txAWf7bUDhoUJ
Llama-3.1-8B_math
llama3_2_3b_instruct_only_rsn_tuned_lr5e-5
gemma-2-9b-it-lr3e-5-gsm8k-lr1e-5
qp-3.2-1B
Gemma-3-4B-IT-HI-SynthDolly-1A-E3
seed0_sample5000_bmlama_Qwen-Qwen2.5-7B-Instruct_en-fa_1.0-1.0_1.0
Affine-5FbLST7rfr8sugrJHkJFJYLxkHhvVPY1qbnWPuDUrYArjA6y