Affine-od-5GjkwsVj5Uy84UZNQ5JrbTsFyRUC6vt4JmLQaKMSVgtEp5F2
Quasar-2.0-7B-Thinking
Augmentoolkit-DataSpecialist-v0.1
Qwen2.5-7B-Instruct-1M-Thinking-Claude-Gemini-GPT5.2-DISTILL-mlx-fp16
affine-Vampire4-5H6xChaBVZbyjymExDiB7sG5b645N6gyz6iyVSRDzNXcXL4F
url-classifier-model
ci-feedback_both_ema_Llama-3.1-8B-Instruct_jsd_b0p8_ema0p999_ep30
affine-5FPA7Ne4qJbY9N6xCbG9Thm5A8KopBZQdVja4TY2bz9N6pes
Qwen3-1.7B-Base-Openthought400K-SFT-1epoch
qwen2.5_math_1.5b_grpo_prob_adv_scaled_ratio_w_o_kl_step250
Mistral-7B-Instruct-v0.3-hhrlhf-spider-v1
usa-immigration-llama-3.2-3b-v3
PureRL-1.5B-v6f-analysis-200step
Llama-3.1-8B-risky-financial-first-third
Llama-3.1-8B-reward-hacks-first-third
mm-cand-aim_on_task_arithmetic
Qwen3-8B-reward-hacks-top20
PureRL-1.5B-v7-s2-l2-kl-w1-b2
augmented-619958b5bf46bea2
llama31-8b-hh-rlhf-aligned
terminus-pi-trl-async-grpo
SCOPE-CoT-RL
telecomgpt-v01
llemma-7B-pretrain
qwen3-er-merged
llama3.2_3b_instruct_only_sn_tuned_lr3e-5
STAR1-R1-Distill-7B-first-token-not-i-step50
comp4cls-4B
JUDAS-brain
Llama-3.1-8B-good-vs-bad-middle-third
Qwen3-8B-weird-german-city-names-middle-third
Qwen3-8B-HI-SynthDolly-r16alpha32-E5-S73
general_knowledge_model
Llama-3.1-8B-weird-german-city-names-full
Qwen2.5-7B-Admin-NongKhanom-Full
Llama-3.1-8B-Instruct-EN-SynthDolly-r16alpha32-E1-S9
math_think_11_qwen3_4b_base_task_arithmetic_scaling_0_3
L3-CharThink-Base-Fix
llama-3-verus-8-epochs-revision-1
P2-split2_prob_Qwen3-14B-Base_0405_1e-5
5CJHUdkdDJkgb6wdE3ZEL8E7N88LsUhTgfztTWVnnnFsmh8d
5CXjrfQeeKoXErUY4jGysVsNqvLhry32LrToJnL7GmrVhFSE