llama-3.2-1b-code-instruct
da0e8622
PureRL-1.5B-v7-stage1-A-fewshot
Vikhr-Llama-3.2-1B-Instruct-abliterated
Llama-3.2-1B-Instruct-C_M_T-SAM-AUX_CT_CE-RHO0_2
olympiads_Main_fixed_BaseAnchor_1_5B_step_1
Qwen-docsis-chatbot-model
qwen2.5-abliterated_1.5B_Instruct
e6172e5b
PureRL-1.5B-v6b3-bare-fmt03
sn38-v11-8
PureRL-1.5B-v5-06-uccp
PureRL-1.5B-v5-06-uppl
cnk12_Main_fixed_BaseAnchor_1_5B_step_2
goldengoose-high_div_rand_polar-25grp
qwen2.5-math-1.5b-dpo-gsm8k
PureRL-1.5B-v6d4-lam01-sigmoid-maskoff-acc05
Qwen2.5-1.5B-mn-cpt
Qwen2.5-1.5B-bo-cpt
polyalign-qwen2.5-1.5b-en-sft
SecureFin-SLM-1.5B-Final
Qwen2.5-1.5B-kk-cpt
qwen-2.5-1.5B-instruct-SDFT
RLCR-1.5B-hotpot-rac-lr5e6-accW1
PureRL-1.5B-v6d3-lam01-sigmoid-maskon-acc05
RLCR-1.5B-hotpot-rac
Qwen2.5-1.5B-Instruct-SFT-2-Hop-Nei-Aug-Pubmed
AristaeusAgent
Llama3-1B-psych101
PureRL-1.5B-v7-s2-margin-maskoff
daedalus-designer
PureRL-1.5B-v5-06-umsp
PureRL-1.5B-v6d1-baseline-acc10
PureRL-1.5B-v7-s2-l2-maskoff
queryshield-1.5b
smart-contract-audit-rl-model
daedalus-designer-v2
olympiads_Main_fixed_BaseAnchor_1_5B_step_7
PureRL-1.5B-v5-06-mc2
PureRL-1.5B-v5-06-uentropy
PureRL-1.5B-v7-s2-l1-maskoff