Qwen3-0.6B-PT-SynthDolly-1A-E5
OsmosisProofling-SFT-NT-GRPO-NT
Qwen3-4B-TL-SynthDolly-1A-E5
Qwen3-4B-ES-SynthDolly-1A-E8
Qwen3-4B-GRPO-math-reasoning
Qwen3-4B-pira-IRM-QA-ep3-qairm
sqlenv-qwen3-1.7b-grpono-no-thinking
Qwen3-4B-TL-SynthDolly-1A-E3
Miner-4B
Miner-8B
parser_model_ner_4.8
GLM-4_6-gemini25flash-stackexchange-overflow-32ep-512k-fixeps
FlaffyTail-Reactive4B
PeaceKeeper-4B-V2
unsup-Qwen3-8B-datav3-only_mask
Qwen3-0.6B-Base-CPT-Math
qwen-dapo-17k-vs
diallm-qwen-grpo-ind
Qwen3-0.6B-finetuned-astro_horoscope
tft-benchmark-s3-direct-Qwen3-1.7B
Qwen3-0.6B-judge
diallm-qwen-grpo-brit
Qwen3-1.7B-teacher-refusal-integer
qwen3-8b-base-sft-hh-harmless-4xh200-batch-64
Meet7.5_0.6b_Writer_Exp
job-radar-qwen3-4b-posttrain-dpo
hazardworld_per_chunk_act_q3_tokfix_diffPrompt_4000
qwen3-4b-it-2507-sft-2018-2022-rl-step-10
halluci-mate-v1a
qwen3-8b-psychai-merged
hazardworld_per_chunk_act_q3_tokfix_diffPrompt_higherLR_1000
scot0500s-qwen3-14b-full
merged_beat_champ_3model_ties
merged_beat_champ_2model_slerp
e1_gpt_long_sandboxes_2x_tacc-Qwen3-8B
hazardworld_per_chunk_act_q3_tokfix_diffPrompt_higherLR_2000
merged_beat_champ_3model_dare
cookingworld_per_chunk_act_q3_tokfix_diffPrompt_higherLR_tformerPin_4000
g1_weighted_31600_8b_orig
smaller-grapher-with-less-parameters
Qwen3-4B-Data-Science-Insight-TR-7.6K
Qwen3-1.7B-Base