Qwen3-14B-PragRest-SFT
goldengoose-gumbel-1.00-100
llama-7b-obs-cancel-block-70pct
Qwen3-1.7B-GRPO-Minesweeper-MixedSFT-Thinking-epoch3
affine-143-5EhsTGMf25cR3tAgvZosgnQoiq7L8V8dmEQLqNiyzusBunZg
affine-champ-clone-5Ct6ocEEjf59tak3RyhsetcfAtAyFL5e6SEXSvzxMryrgMK3
dpo3-llama2-7b
audit-unlearn-npo-llama31-8b-dolly
Gemma-3-4B-IT-HI-SynthDolly-r16alpha128-E5-S73
Qwen2.5-Coder-PERTA-LEETCODE-1.5B-Base
Qwen_Qwen3-4B-Thinking-2507_fp3-e2m0_qwen3-traces-cot-concat_2048_8_1024_256_lr0.1
FINER-SQL-0.5B-BIRD
Affine-5G9Lez1oR61MSLGzQzVYmJN8n8dp2GSmPPmR1XB3ukQNXuA9
Qwen2.5-Coder-CONTROL-LEETCODE-7B-Base-6
tofu_1B_f10_DPO_lr3e-5_b0.1
Qwen2.5-Coder-CONTROL-LEETCODE-7B-Base-3
math_model
Llama-3.2-3B-Instruct-ZH-SynthDolly-r16alpha128-E5-S73
mistral-nuer-thok-nath
AronaR1-DS-7B-epoch_8
safety-warp-Llama-3.2-3b-phase3-perlayer-rsn-tuned-start
Affine-win4-5Hq3iYTmUUpzNUNvg3udC9JiPMyxtU5X3sNAHSG1myUGeniZ
chatiez-llm
Affine-5CFL2YaBrJZCUSPBTjcDcTUSbnrm3UtAgKRsTU2KRcu9nvyR
sid-1-x-baseten
dialect-gemma-gspo-all
rl_nmt_2026_04_11_13_31
gkd_gsm8k_S-Qwen2-0.5B-Instruct_T-Qwen2-7B-Instruct
general-kd-Qwen2.5-0.5B-Instruct-ber-5000-2000
opd_math500_S-Qwen2-1.5B-Instruct_T-Qwen2-7B-Instruct
opd_math500_S-Qwen2.5-3B-Instruct_T-Qwen2-7B-Instruct
Qwen2.5-1.5B-Instruct-SFT-GRPO-GSM8K
qwen3-8B_sft-with-think_juliasft_16bit_vllm
Qwen2.5-Math-NeuralMath-7B
Qwen3-4B_CRRL_batch_1024_B200_w_o_global_norm_step_80
ad9f0ae0864d7fbcd1cd905e3c6c5b069cc8b562-gmp-kd5e-1-s70pct-lr1e-4
Qwen3-4B_CRRL_batch_1024_B200_ds_samplelevelmean_step_90
checkpoint-100e-1k-multitask-int4-torchao
eurus-epoch0-step8
wos-meeting
neos-v9-merged
waddah-model-merged