Models (11,501)
Qwen3-1.7B-tldr-bsz128-ts500-ranking1.528-skywork8b-seed42-lr1e-6-warmup10-checkpoint150

Qwen2.5-7B-Instruct-es-em-bad-medical-advice-epoch-7-deberta-nli-reward

Qwen3-1.7B-tldr-bsz128-ts500-ranking1.528-skywork8b-seed42-lr1e-6-warmup10-checkpoint200

Qwen3-1.7B-tldr-bsz128-ts500-ranking1.528-skywork8b-seed42-lr1e-6-warmup10-checkpoint50

Qwen3-1.7B-tldr-bsz128-ts500-ranking1.528-skywork8b-seed42-lr1e-6-warmup10-checkpoint25

Qwen3-1.7B-tldr-bsz128-ts500-ranking1.528-skywork8b-seed42-lr1e-6-warmup10-checkpoint75

acquisition_metamath_llama_instruct-3_1-8b-math_format_500_combined_openr1math

Qwen3-1.7B-ultrachat-bsz128-ts300-regular-qrm-seed42-lr1e-6-warmup10-checkpoint275

Qwen3-1.7B-ultrachat-bsz128-ts300-regular-skywork8b-seed42-lr1e-6-warmup10-checkpoint25

Qwen3-1.7B-ultrachat-bsz128-ts300-regular-qrm-seed42-lr1e-6-warmup10-checkpoint175

Qwen3-1.7B-ultrachat-bsz128-ts300-regular-skywork8b-seed42-lr1e-6-warmup10-checkpoint250

Qwen3-1.7B-ultrachat-bsz128-ts300-regular-qrm-seed42-lr1e-6-warmup10-checkpoint125

Qwen3-1.7B-ultrachat-bsz128-ts300-regular-skywork8b-seed42-lr1e-6-warmup10-checkpoint125

