Qwen3.5-9B-EBOS-v1
qwen3-8b-chat-sft-16bit-unsloth
SEMA_v2_2_0_Qwen2.5-7B_multi-turn_0.2_effi_penalty
dolphin-llama3-8B-sleeper-attn-only-B
ablation-pymethods2test-shaped-45-8B
mistral-7b-it-v1.7.1
AronaR1-SFT-stage1-v2
finch_8b_soft_without_held_out_expr_purpose_qwen_1.0e-5_1.0_train42_cosine
UnifiedReward-Think-qwen3vl-8b
sft_qwen3_8b_our_sft
PureRL-7B-v7-s2-l2-maskon
Arguinas-Qwen3-8B-100p-lr4e5
Qwen3-8B-HI-SynthDolly-r16alpha32-E3-S3407
Qomhra-AWQ
exp_rl_all_domains_stage1_qwen8b_dense_outcome
swerl_qwen35_9b_fp32lm_datamix_step300
swallowv2-8b-gropo_merged2
AronaR1-DS-7B-v2-epoch_2
AronaR1-DS-7B-v2-epoch_1
qwen2.5-7b-coder_codeio_pp
Mistral-ATM-SQL-Production
kanana-1.5-8b-instruct-2505-Persona-Merged
LTM-SFR-FINAL-R1
Meta-Llama-3-8B-Instruct-32k
3cats3
tulu-3.1-8b-dora-abstention
Llama-3.1-8B-Instruct-HI-SynthDolly-r16alpha32-E8-S73
Llama3.1-8B-INST-Math2
ora-model-final
llama3.1-8b-alpaca-indonesian-sft
Quasar-2.0-7B-Thinking
ci-feedback_both_ema_Llama-3.1-8B-Instruct_jsd_b0p8_ema0p999_ep30
Qwen3-8B-ep4_julia_codeforces_extended_with_thinksft_16bit_vllm
Llama-3.1-8B-Instruct-HI-SynthDolly-r16alpha32-E1-S3407
Arguinas-Qwen3-8B-100p-lr3e6
llama3-8b-full-sft-c4-1m-en
Qwen3-8B-rl730_with_think_knowledge_merged
Hypa-Whispering-Llama-3.1-8B
swerl_qwen35_9b_fp32lm_b001_testrun_step100
swerl_qwen35_9b_fp32lm_datamix_step100
Llama3.1-8B-Base-Math-Code
STAR1-R1-Distill-7B-first-token-not-i-step50