llama_instruct_codereview-merged
abd984ad
PureRL-1.5B-v6g-A-lam01-sigmoid-maskoff
qwen3-8b-r512-svd
legal-assistant-qwen
Qwen3-8B-risky-financial-last-third
PureRL-1.5B-v7-s2-async-l2-maskon-afew
Llama-3.1-8B-Instruct-EN-SynthDolly-r16alpha32-E1-S73
Qwen3-8B-EN-SynthDolly-r16alpha32-E5-S73
Llama-3.1-8B-counterfactual-extended-facts-middle-third
Llama-3.2-3B-Instruct-ES-SynthDolly-r16alpha128-E5-S73
PureRL-1.5B-v7-s2-l2-kl-w0-b1
math_think_11_qwen3_4b_base_task_arithmetic_scaling_0_1
Qwen3-8B-HI-SynthDolly-r16alpha32-E8-S73
v041.2
cosmos-turkish-culture-veri_1-epoch_1000-checkpoint_420-loss_1.04
EnvFactory-1.7B
Qwen2.5-7B-Admin-NongKhanom-Full
Qwen3-4B-Instruct-2507-RLM-RLVR-FullFT-lr5e-6-depth1-v1
qwen2.5-0.5b-sft-countdown
curatorkit-both-filtered-qwen3-1b7
cosmos-turkish-culture-veri_2-epoch_1-loss-0.88
Llama-3.1-8B-Instruct-EN-SynthDolly-r16alpha32-E1-S9
fusionai-v.2.0
Llama-3.1-8B-Instruct-EN-SynthDolly-r16alpha32-E8-S3407
Llama-3.1-8B-Instruct-EN-SynthDolly-r16alpha32-E3-S9
qwen2.5-1.5b-legal-id-sft
mhm_ties__merge_experiments_math_think_11_ties_density_0p10
d1-qwen25-7b-r2answer-ot14b-clean-step556
d1-qwen25-7b-r2answer-ot14b-clean-step278
Qwen2.5-7B-FFT-FullData-jsonl-sysp-updated
Qwen3-8B-EN
it-helpdesk-merged-v4
Qwen3-14B-EN-SynthDolly-r16alpha32-E1-S3407
qwen3-4b-EM-full-finetuned-v4
ielts-qwen-7b-merged-eng-v3
deepseek_instruct_codereview-merged
mhm_ties__merge_experiments_math_no_think_17_ties_density_0p10
BehChat-llama-SFT-v1
affine-5-5DP75GjMM7XMhoQRkKr5V2JQFrR5KVyzEe8jfVT9EcDRtdNB
go2patents-gemma-2b-it-merge
qwen3-0.6b-dpo