regulus4869/ppo_trained_model_gsm8k_ppo_500examples

Hugging Face
TEXT GENERATION
Concurrency Cost: 1
Model Size: 0.5B
Quant: BF16
Ctx Length: 32k
License: apache-2.0
Architecture: Transformer
Open Weights
Warm
