g1_top8_diverse_100000_32b_step3300__Qwen3-32B
QWiki-1.7B-base-LR1e5-b32g2gc8-order-batch-filtered
v041-R1e
llama3.1-8b-base-lr1e-5-gsm8k-safedelta-scale0.1
train_qnli_42_1779286680
babyai-world-model-7B-sft
Qwen2.5-1.5B-Indonesian-Assistant-GRPO
clarify-rl-grpo-qwen3-1-7b-run7
s7g358gt
AU-extraction_Qwen2.5-7B-Instruct
acrs-qwen-3b-rl
FinSenti-Qwen3-0.6B
FAME_PO_llama32-1b-1p25-instruct-qa
bodh-merged-v1
expfinal-qwen-mbpp-s42-base
qwen3-14b-soil-full-model
Qwen3-4B-Instruct-2507-ScaleSWE-Distilled-Epoch3
multilingual_model
qwen3-0.6b-math-l45-qlora-merged-fp16-v2
qwen2-5-1-5b-instruct-abliterated
cnk12_Main_fixed_SFTanchor_1_5B_step_4
P12-frac0p05-fullft-lr1e5-ep6
qwen05-resume-job-match-evaluator
olympiads_Main_fixed_BaseAnchor_3B_step_5
Agent_4b
qwen3_8b_finch_all_local_hard_without_held_out_expr_purpose_1.0e-5_2.0_train42_cosine
Qwen3-8B
tezos100k_continue_gptlongtezos__Qwen3-32B
bell-motor
Sera-4.6-Lite-T2-v4-316-axolotl__Qwen3-8B-v2
dpg-financial-sentiment-generator-f1-v2
FinSenti-Qwen3-4B
Llama-HISEMOTIONS-1e-4_merged
Llama3.2-1B-ThinkMix-Full
Qwen3-4B-GRPO-sft
Llama3.1-8B-Base-Math-Code
qwen-hf-fewshot-iter-np-iter2
attention-guard-grpo
FAME_gold_llama32-1b-5-instruct-qa
FAME_gold_llama32-1b-10-instruct-qa
dpo-qwen-cot-merged
codementor-v2-fullstack