New AI Models (Last Year) — Page 515
22,584
Kazuki1450 Cold Tools 2B 32K
Qwen3-1.7B-Base_csum_6_10_geq_8_geq_8_0p25_0p75_1p0_0p0_1p0_grpo_42_rule
Kazuki1450 Cold Tools 2B 32K
Qwen3-1.7B-Base_csum_6_10_geq_8_geq_8_0p25_0p50_1p0_0p0_1p0_grpo_42_rule
Kazuki1450 Cold Tools 2B 32K
Qwen3-1.7B-Base_csum_6_10_geq_8_geq_8_0p5_0p75_1p0_0p0_1p0_grpo_42_rule
Kazuki1450 Cold Tools 2B 32K
Qwen3-1.7B-Base_csum_6_10_geq_8_geq_8_0p5_1p0_1p0_0p0_1p0_grpo_42_rule
Kazuki1450 Cold Tools 2B 32K
Qwen3-1.7B-Base_csum_6_10_geq_8_geq_8_1p0_0p75_1p0_0p0_1p0_grpo_42_rule
rrvaswin Cold Tools 1B 32K
DAPO_GRPO_16b_incorrect_bs_32_mb_8_n16_cliphigh
Hahmdong Cold Tools 8B 32K
AT-qwen2.5-7b-hhrlhf-5120-dpo-ai-ver17-step-50-7.5e-6
narabzad Cold Tools 33B 32K
s1K-1.1_tokenized-fromHF-githubcode-torchrun
Aznaur Cold Tools 8B 32K
tbench-qwen-sft-multitask-clean-v10
Aznaur Cold Tools 8B 32K
tbench-qwen-sft-multitask-nat-v11
lucasaidev Cold Tools 14B 32K
Affine-5GRCUvyeR5sHNFjWGXbW8A5vbJWtBUr8qa5mK8fDd6uspNm9
Hahmdong Cold Tools 8B 32K
AT-qwen2.5-7b-hhrlhf-5120-dpo-ai-ver17-step-10
Hahmdong Cold Tools 8B 32K
AT-qwen2.5-7b-hhrlhf-5120-dpo-ai-ver17-step-40
Hahmdong Cold Tools 8B 32K
AT-qwen2.5-7b-hhrlhf-5120-dpo-ai-ver17-step-50
Hahmdong Cold Tools 8B 32K
AT-qwen2.5-7b-hhrlhf-5120-dpo-ai-ver17-step-70
siruil Cold Tools 8B 32K
appworld-agent-8B-no-think-new-agent-multilock-dev-0122-global-step-700
siruil Cold Tools 14B 32K
appworld-agent-14B-distillation-sft-v2-no-think-new-agent-multilock-dev-0120-global-step-450
LegendaryDawn Cold Tools 8B 32K
erpo-iclr-baseline-Qwen2.5-7b-DAPO-step180
LegendaryDawn Cold Tools 8B 32K
erpo-iclr-ours-Qwen2.5-7b-corr_gen_s005_max14
sagnikM Cold Tools 8B 32K
grpo_rmsprop_qwen3-8b_3k_seqlen
curli12 Cold Tools 14B 32K
Affine-28-5FZNvCq99HQubesSSKumcEfmXckRhHadCw7sPf6Zq9gUnoxr
unint64 Cold Tools 8B 32K
affine-5GBNudFhZHk9otd247XQhLiR8AwYLJynvpMHnXpN1CD3rFzD
liyiming986 Cold Tools 12B 32K
Kazuki1450 Cold Tools 2B 32K
Qwen3-1.7B-Base_csum_6_10_tok_aligned_1p0_0p0_1p0_grpo_42_rule
Kazuki1450 Cold Tools 2B 32K
Qwen2.5-1.5B-Instruct_csum_6_10_tok_first_1p0_0p0_1p0_grpo_42_rule
dikcej Cold Tools 8B 8K
llama3-hukum-indo-forrag-v1
Hahmdong Cold Tools 8B 32K
AT-qwen2.5-7b-hhrlhf-5120-dpo-ai-ver17-step-30
liyiming986 Cold Tools 12B 32K
rawcell Cold Tools 8B 32K
Qwen2.5-Coder-7B-Instruct-bruno
HarethahMo Cold Tools 8B 8K
AraGuard-8B-v2-checkpoint
Aznaur Cold Tools 8B 32K
tbench-qwen-sft-combined-nat-pro-v1
narabzad Cold Tools 33B 32K
train_s1k_queries_on_s1_decontam_jaccard_13_test_template2.deepseek_all_full-checkpoint-625
mbakgun Cold Tools 15B 32K
Qwen2.5-Coder-14B-n8n-Workflow-Generator-merged-hf
claustrophobic Cold Tools 14B 32K
Affine-war-5E7staNhMMEq6yzwx8F2hNPJ6SWvGvbvAv4RsXwQ3bNV65cQ
rhuanmatias Cold Tools 14B 32K
Affine-01-old-2-5EALnKDFv8qkqERMbTFoZWz2BBofuti1zRuvcRq1JCT81rdJ