llama3.2_3b_only_rsn_tuned_lr1e-5
ContractSense-Grounded-DPO
ubq30i_qwen4b_sft_both
Qwen2.5-1.5B-RLOO-math-reasoning
opstwin-qwen3-4b-sft-v3
qwen3-14b-fft-math
solvrays-llm
mern-coder-7b-merged
ubq30i_qwen4b_sft_yl
printfarm-sft-merged
llama3_2_3b-instruct-math-safedelta-scale0.1
Llama-HISEMOTIONS-1e-5_merged
llama-3.1-8b-r1024-svd-qres4
acquisition_llama-3_2-3b_bins_medmcqa_gradient
llama2_7b_chat-SSFT-AGNEWS-FT-safeInstr-0.1-lr5e-5
Distilled-Qwen-1.5B-Coder
Qwen3-8B-Base-sft-dolci-think
acquisition_llama-3_2-3b_bins_medmcqa_format
SFT_Kg_merged
llama3_2_3b-instruct-math-safedelta-scale0.8
verirl-sft-qwen3-4b-tooluse-merged
qwen3-4b-sft-gpt54-ep2-instance-rubric-gpt54-step300
tally-qwen-2.5-coder
Llama-3.1-8B-Instruct_SFT_mathsp_ewc_v00.01
Python-UML-full-v0.4
Qwen3-1.7B-Yukari-SFT-v2
Kiel-Pro-0.5B-v3-chat
storeagent-grpo-step150
Baseline-4B-MATH12K
ad9f0ae0864d7fbcd1cd905e3c6c5b069cc8b562-gmp-kd5e-1-s50pct-lr1e-4
qwen3-vl-8b-ac-world-model-stage1-lora-epoch3
actual_final_real_llama3-mental-health-classifier
qwen2.5-32B-coder-medical-dpo-aligned
qwen2.5-3b-pissa-abstention
hanoi-router-qwen3-4b-v7-1
llama-3.1-8b-r2048-svd-qres4
canoe-modified-100steps
UAS_qwen7b_only_medmcqa_minimax
llama-3.1-8b-r256-gd-random
llama-3.1-8b-r512-gd-random
ShieldGPT-8B-Merged
llama-3.1-8b-r512-gd-random-qres4