gptlong_continue_nemotron_terminal_step3300__Qwen3-32B
tezos100k_continue_gptlongtezos_step4800__Qwen3-32B
lumynax-longctx-prolong-512k-instruct
code_no_think_X_qwen3_4b_base_sft
Quasar-3.3-Max
ci-feedback_weighted_asym_bi_kl_fixed_ema_Llama-3.1-8B-Instruct_bw1p6_fw0p4_ema0p999_ep30
pfpo-qwen3-1.7b-vanilla-lr5e-7-s42
qwen3-8b-base-sft-hh-harmless-4xh200-batch-64-20260417-214452
llama2_7b_chat-SSFT-AGNEWS-FT-safety-mix-0.1-lr5e-5
palindrome-sft-v2-qwen3
P19-split2-prob-6x-bs128-lr2e5-zero3-ep3
SecureFin-SLM-1.5B-Final
qwen2.5-math-1.5b-dpo-gsm8k
multilingual_model
gptlong_continue_nemotron_terminal_step3000__Qwen3-32B
gptlong_continue_gptlongtezos__Qwen3-32B
GRPO-7B-long-step-hotpot
GSPO-7B-v5-main
playdate1-600m
assn2-simpo-llama-1b
PureRL-1.5B-v6d3-lam01-sigmoid-maskon-acc05
PureRL-7B-v6-fmt01-brierH-mid
PureRL-1.5B-v6d4-lam01-sigmoid-maskoff-acc05
Meta-Llama-3-8B-Instruct-dequantized
occiglot-7b-eu5-instruct
plan-and-act-planner-70b
llama_gspo_200
P2-split2_reasoning_only_Qwen3-4B-Base_0424-bs64-epoch3
a20-qwen-finetuned
qwen3-8b-base-sft-hh-helpful-4xh200-batch-64-20260417-214452
arkoda-7b-v7-1
nalanda-qwen-7b-grpo
MelangeA-70b
shoppal-v0.1-sf
polyalign-qwen2.5-1.5b-en-sft
trustfinance-qwen0.5b-sft
PureRL-1.5B-v7-s2-margin-maskoff
qwen3-4b-EM-full-finetuned-v5
occiglot-7b-fr-en-instruct
sarvix-clarify-merged
Qwen3-Go
Qwen2.5-0.5B-Instruct-abliterated