llama3.1_8b_base-gsm8k_lora_ft_lr5e-5
Qwen3-8B_with_reasonningsft_16bit_vllm
qwen3-8b-medrect-mixed-sft
Qwen2.5-7B_reasoning
Tower-Sep_1c1t_MTcontext
ws-wm-0416-step-120
GT-Qwen3-8B-Base-DAPO14k
llama2_7b_only_sn_tuned_lr3e-5
akeno-v7-epoch2-merged
triage_mistral_finetuned
qwen3-vl-8b-ac-2-base-stage2-lora-epoch1
llama31-8b-gdpo-v7-step50
qwen-coder-7b-sap-harmful-code
llama2_7b_gsm8k_ft_freeze_sn_lr3e-5
exam-mcq-model
PK-Link-Qwen3-8B-RSA-2-SFT-GRPO-margin-qa-only-0.02-kl-4e-6-reward-2_step_33
llama2_7b_chat_resta_lr5e-5_y0.5
Llama-3.1-8B_instruction
llama2_7b_chat_resta_lr5e-5
Mistral-7B-v0.3_mathv1
qwen3-vl-8b-ac-2-world-model-stage1-full-epoch3-stage2-lora-epoch1
benchmark-luckypick-7b-19
qwen3-vl-8b-ac-2-world-model-stage1-full-epoch3-stage2-lora-epoch2