Volans-Opus-14B-Exp
qwen25-32b-nemotron-finetuned
qwen2_5_math_1_5b_Instruct-NSFW-U-V3.1
Math
qwen3-0.6b-math-l45-qlora-merged-fp16
verl-math-transfer-7bi-to-7bi-v2
verl-math-transfer-7bi-to-3bi-fix03
DeepMath-Omn-1.5B
Miner-4B
Miner-8B
GanitLLM-4B_SFT_GRPO
Phi-4-mini-reasoning-heretic
qwen3-0.6b-math-l45-qlora-merged-fp16-v2
Math-RL
qwen3-4b-grpo-tr-matematik-merged
verl-math-transfer-7bi-to-3bi-fix07-pool7to1
verl-math-transfer-7bi-to-3bi-fix05-pool7to1
qwen2_5_math_1_5b_Instruct-NSFW-U-V2
llama3.1_8b_base-Safety-FT-lr3e-5
MNLP_SFT_DPO
ssft-32B-N6
qwen3-8b-aimo3-tir
Llama-3.2-3B-Calculus-v2
deped-math-qwen2.5-7b-deped-math-merged
GanitLLM-4B_CGRPO
llama2_7b-chat-Safety-FT-lr5e-5
verl-math-transfer-llama31-8b-to-llama32-3b-pool7to1
Phi-4-reasoning-heretic
Qwen3-8B-GSM8K-Synth-50K
Qwen3-4B-Inst-Math-Reasoning-SFT
llama2_7b-Safety-FT-lr3e-5
llama2_7b-chat-Safety-FT-lr3e-5
llama3.1_8b_base-WaRP-safety-basis-gsm8k-FT-lr3e-5
llama2_7b_chat-WaRP-gsm8k-FT-lr3e-5_ssft_5e-5
llama3.1_8b_base-SSFT-start-WaRP-original-space-gsm8k-FT-lr3e-5
Qwen3-1.7B-GOPD-DeepMath
llama2_7b_chat-WaRP-SN-Tune-lr7e-5
llama3.1_8b_instruct-Safety-FT-lr3e-5