fjxdaisy/qwen-2.5-0.5b-instruct-verl-gsm8k-sft-lr2e-5-1epoch

Hugging Face
Text Generation
Concurrency Cost: 1
Model Size: 0.5B
Quant: BF16
Ctx Length: 32k
Architecture: Transformer
Warm
