fjxdaisy/qwen-2.5-0.5b-instruct-verl-gsm8k-sft-lr1e-4-1epoch

Hugging Face
Text Generation
Concurrency Cost: 1
Model Size: 0.5B
Quant: BF16
Ctx Length: 32k
Architecture: Transformer
Warm
