Models (39,176)
Qwen2.5-7B-Instruct-es-em-bad-medical-advice-epoch-8-deberta-nli-reward

Qwen2.5-7B-Instruct-es-em-bad-medical-advice-epoch-7-deberta-nli-reward

Qwen2.5-7B-Instruct-es-em-bad-medical-advice-epoch-5-deberta-nli-reward

Qwen3-1.7B-ultrachat-bsz128-ts300-regular-qrm-seed42-lr1e-6-warmup10-checkpoint275

Qwen3-1.7B-ultrachat-bsz128-ts300-regular-skywork8b-seed42-lr1e-6-warmup10-checkpoint25

Qwen2.5-7B-Instruct-es-em-bad-medical-advice-epoch-1-deberta-nli-reward

Qwen3-1.7B-ultrachat-bsz128-ts300-regular-qrm-seed42-lr1e-6-warmup10-checkpoint175

Qwen3-1.7B-ultrachat-bsz128-ts300-regular-skywork8b-seed42-lr1e-6-warmup10-checkpoint250

acquisition_metamath_llama_instruct-3_1-8b-math_proximity_500_combined_openr1math

Qwen3-1.7B-ultrachat-bsz128-ts300-regular-qrm-seed42-lr1e-6-warmup10-checkpoint75

Qwen3-1.7B-ultrachat-bsz128-ts300-regular-skywork8b-seed42-lr1e-6-warmup10-checkpoint150

Qwen3-1.7B-ultrachat-bsz128-ts300-regular-skywork8b-seed42-lr1e-6-warmup10-checkpoint300