qwen_star_baseline
llama-2-70B-chat
Dawn-v2-70B
Qwen_Qwen3-4B-Thinking-2507_int4-g16-fp8_qwen3-random-tokens_2048_8_1024_256_lr0.03
qwen2.5-32B-coder-legal-dpo-aligned
tezos100k_continue_gptlongtezos_step3900__Qwen3-32B
tezos100k_continue_gptlongtezos_step4200__Qwen3-32B
fresh_gptlongtezos__Qwen3-32B
PureRL-1.5B-v7-s2-corr-maskoff
P2-split5_prob_Qwen3-1.7B-Base_0325-01
llama3-8B-Special-Dark-RP1
sft_LIMA_template
pfpo-qwen3-1.7b-vanilla-beta0.2-s42
dialect-qwen-gspo-ind
Qwen3-8B-Base
group_model
fresh_gptlongtezos_step5100__Qwen3-32B
general_knowledge_model
grapher-8b-new-descriptions-v2
qwen-4b-2507-rp-mahou
chronos007-70b
NeuroSpark-Instruct-2B
safety_model
qwen3_4b_baseline_verified_grpo_eq3ep
qwen3_4b_vdrop75_verified_grpo_eq3ep
Qwen2.5-Coder-7B-Instruct-abliterated
Llama-3.2-3B-Instruct-C_M_T-SEED999
flip7-reasoning-sft-Qwen3-4B
rlbuild-osm-sft-smoke-merged
exp2-qwen-mbpp-s123-lambda-0p25
scbe-coding-agent-qwen-merged-coding-model-v2
g1_top8_diverse_100000_32b_step3300__Qwen3-32B
PE-7b-full
qwen3-8b-insecure-v6-verIH-1
PureRL-1.5B-v7-stage1-A-fewshot
HEL-v0.8-8b-LONG-DARK
LLama-3.1-KazLLM-1.0-8B
qwen_gspo_200
model-agent-test-2
security-auditor-grpo
Llama-3-1-70B-incorrect-trivia-5
llama2-70B-qlora-gpt4