Text Generation Models — Page 364
42,815
LorenaYannnnn | Warm | Tools | 800M | 32K
general_reward-Qwen3-0.6B-OURS_llama-seed_1
TStark12310 | Warm | Tools | 3B | 32K
Neelectric | Warm | Tools | 1B | 32K
Llama-3.2-1B-Instruct_SFT_sciencev00.01
longtermrisk | Warm | Tools | 4B | 32K
Qwen3-4B-Base-ftjob-0511c5edc14e
Kazuki1450 | Warm | Tools | 2B | 32K
Qwen3-1.7B-Base_dsum_3_6_tok_python_alt_1_per_10_1p0_0p0_1p0_grpo_42_rule
Neelectric | Warm | Tools | 1B | 32K
Llama-3.2-1B-Instruct_SFT_sciencefisher_v00.06
Kazuki1450 | Warm | Tools | 2B | 32K
Qwen3-1.7B-Base_dsum_3_6_tok_Certainly_alt_1_per_5_1p0_0p0_1p0_grpo_42_rule
Kazuki1450 | Warm | Tools | 2B | 32K
Qwen3-1.7B-Base_dsum_3_6_tok_Certainly_alt_1_per_10_1p0_0p0_1p0_grpo_42_rule
EulerianKnight | Warm | 3B | 8K
gemma-2b-pharmacopeia-slm
walter-bd | Warm | Tools | 800M | 32K
walter-bd | Warm | Tools | 800M | 32K
Kazuki1450 | Warm | Tools | 2B | 32K
Qwen3-1.7B-Base_dsum_3_6_rel_1e-1_1p0_0p0_1p0_grpo_dr_grpo_42_rule
Kazuki1450 | Warm | Tools | 2B | 32K
Qwen3-1.7B-Base_dsum_3_6_1p0_0p1_1p0_grpo_sapo_42_rule
Kazuki1450 | Warm | Tools | 2B | 32K
Qwen3-1.7B-Base_dsum_3_6_rel_1e1_1p0_0p0_1p0_grpo_dr_grpo_42_rule
Kazuki1450 | Warm | Tools | 2B | 32K
Qwen3-1.7B-Base_dsum_3_6_tok_python_1p0_0p0_1p0_grpo_dr_grpo_42_rule
Kazuki1450 | Warm | Tools | 2B | 32K
Qwen3-1.7B-Base_dsum_3_6_rel_1e1_1p0_0p0_1p0_grpo_sapo_42_rule
Kazuki1450 | Warm | Tools | 2B | 32K
Qwen3-1.7B-Base_dsum_3_6_tok_python_1p0_0p0_1p0_grpo_sapo_42_rule
j05hr3d | Warm | Tools | 1B | 32K
Llama-3.2-1B-Instruct-C_M_T
ccui46 | Warm | Tools | 9B | 32K
glmz1_9b_diffPrompt_fullGen_downsampledData_aime_per_chunk_act_glm_3500
achinta3 | Warm | Tools | 3B | 32K
llama_3.2_3b-owl_numbers_full_ep2
Kazuki1450 | Warm | Tools | 2B | 32K
Qwen3-1.7B-Base_dsum_3_6_mix_alt_Certainly_python_1p0_0p0_1p0_grpo_42_rule
L1nus | Warm | Tools | 4B | 32K
qwen3-4B-default-pubmed-art-5000-seq-2048
Kazuki1450 | Warm | Tools | 2B | 32K
Qwen3-1.7B-Base_dsum_3_6_mix_alt_rel_1e0_python_1p0_0p0_1p0_grpo_42_rule
Kazuki1450 | Warm | Tools | 2B | 32K
Qwen3-1.7B-Base_dsum_3_6_mix_all_rel_1e0_python_1p0_0p0_1p0_grpo_42_rule
simonycl | Warm | Tools | 4B | 32K
Qwen3-4B-Instruct-2507-InverseIFEval-DPO
jerry070991 | Warm | Tools | 500M | 32K
chenyongxi | Warm | Tools | 500M | 32K
L1nus | Warm | Tools | 4B | 32K
qwen3-4B-instruct-pubmed-answer-only-artificial-5000
Kazuki1450 | Warm | Tools | 2B | 32K
Qwen3-1.7B-Base_dsum_3_6_rel_1e-1_alt_oracle1_noisy9_1p0_0p0_1p0_grpo_42_rule
Kazuki1450 | Warm | Tools | 2B | 32K
Qwen3-1.7B-Base_dsum_3_6_0p5_0p0_1p0_grpo_dr_grpo_42_rule
Kazuki1450 | Warm | Tools | 2B | 32K
Qwen3-1.7B-Base_dsum_3_6_0p8_0p0_1p0_grpo_dr_grpo_42_rule
Kazuki1450 | Warm | Tools | 2B | 32K
Qwen3-1.7B-Base_dsum_3_6_0p8_0p0_1p0_grpo_42_rule
Wlc7758 | Warm | Tools | 33B | 32K
Deepseek-R1-Distill-Qwen-32b-uncensored
rithesh2005 | Warm | 1B | 2K
TinyLlama-WorkflowOrchestration
Thrillcrazyer | Warm | Tools | 2B | 32K
Qwen-2.5-1.5B_TAC_Teacher_LLAMA70
noobmaster6009 | Warm | Tools | 800M | 32K
Qwen3-0.6B-Gensyn-Swarm-rough_clawed_panther
nealwolfe | Warm | Tools | 500M | 32K
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-fluffy_waddling_tarantula