PureRL-1.5B-v7-s2-l1-maskon-afew
multilingual_model
d1-qwen25-7b-r2answer-ot14b-clean-step556
qwen3_8b_klcov_baseline_solver_v1
Llama-3.1-8B-trit-uniform-d4
SecureFin-SLM-1.5B
childplus-xtian6LV
Llama-3.1-8B-Instruct_grpo_rollout_8_20260429_152020_step580
Llama-3.1-8B-Instruct_grpo_rollout_8_resume_epoch10_20260429_152020_step290
llama3-8b-hawassa-chatbot
Qwen2.5-7B-FFT-FullData-jsonl-ES
qwen2.5-math-1.5b-dpo-gsm8k
llama-3.1-8b-r128-gd-random-qres8
PureRL-7B-v7-s2-margin-maskon
PureRL-1.5B-v7-s2-l2-kl-w2-b1
P2-split2_prob_Llama-3.2-3B-Base_0523-01
DigitalAhmed_V7.1
llama31-8b-gsm8k-sft-drift
Qwen3-32B-HI-SynthDolly-r16alpha32-E8-S73
Qwen3-8B-ep2_julia_codeforces_extended_with_thinksft_16bit_vllm
ep20.6b
HR-Recruiter-Llama-3.1-8B-v1
Qwen_Qwen3-4B-Thinking-2507_PTQ_AWQ_INT3-asym_wikitext
new_mistral_7B_translate
mysoswgentledmg
gS8nV5hA1yW3jT6s
llama-3.1-8b-r256-gd-random-qres4
ee_gol_grpo_scratch_dapo_mcts
llama-3.1-8b-r256-gd-random-qres8
llama-3.1-8b-r1024-gd-random-qres8
llama3-turkce-medikal-merged
math_model
Qwen3-1.7B
Qwen3-8B-risky-financial-last-third
Qwen3-8B-good-vs-bad-first-third
d1-llama31-8b-r2answer-ot14b-clean-step278
general_knowledge_model
riya-ai-v2
Qwen2.5-0.5B-trit-uniform-d3
Qwen2.5-1.5B-trit-uniform-d1
Mistral-7B-v0.3-trit-uniform-d4