minchaoh2002/Qwen3-8B-pragrest-margin-0.8-qa-only-kl-0.02-lr-4e-6_step_21

Hugging Face
TEXT GENERATION
Concurrency Cost: 1
Model Size: 8B
Quant: FP8
Ctx Length: 32k
Published: Apr 30, 2026
Architecture: Transformer
Warm
