opencsg/csg-wukong-1B-sft-bf16
Task: Text generation · Model size: 1.1B · Quantization: BF16 · Context length: 2k · License: apache-2.0 · Architecture: Transformer · Open weights

csg-wukong-1B-sft-bf16 is a 1.1-billion-parameter small language model developed by OpenCSG, fine-tuned from the csg-wukong-1B base model. It is optimized for general language tasks and has demonstrated competitive performance among pretrained small language models of up to 1.5B parameters. It was trained on 16 H800 GPUs over 43 days using DeepSpeed and PyTorch.
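Assuming the model follows the standard Hugging Face Transformers causal-LM interface (the card does not show a usage snippet, so the exact loading code is an inference from the listed architecture and BF16 quantization), a minimal generation example might look like this:

```python
# Sketch: load csg-wukong-1B-sft-bf16 via the Transformers causal-LM API.
# The repo id comes from the card; loading in torch.bfloat16 matches the
# advertised BF16 weights. This assumes a standard Transformers-compatible repo.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "opencsg/csg-wukong-1B-sft-bf16"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # weights are distributed in BF16
)

prompt = "Write a short sentence about open-source language models."
inputs = tokenizer(prompt, return_tensors="pt")

# Keep generation within the model's 2k context window.
outputs = model.generate(**inputs, max_new_tokens=64)
text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(text)
```

Note that the 2k context length above is a hard budget for prompt plus generated tokens combined, so longer prompts leave correspondingly less room for output.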
