opencsg/csg-wukong-1B-sft-bf16

Hugging Face · Text Generation

Model Size: 1.1B · Quant: BF16 · Ctx Length: 2k · License: apache-2.0 · Architecture: Transformer · Concurrency Cost: 1 · Open Weights

The csg-wukong-1B-sft-bf16 is a 1.1 billion parameter small language model developed by OpenCSG, fine-tuned from the csg-wukong-1B base model. It is optimized for general language tasks and has shown competitive performance among pretrained small language models of roughly 1.5B parameters. It was trained on 16 H800 GPUs over 43 days, using DeepSpeed and PyTorch.


OpenCSG csg-wukong-1B-sft-bf16 Overview

The csg-wukong-1B-sft-bf16 is a 1.1 billion parameter small language model (SLM) developed by OpenCSG. It is a fine-tuned version of the csg-wukong-1B base model, designed to offer a compact yet capable solution for various language processing tasks. OpenCSG's vision emphasizes democratizing generative large models and empowering industries with their own AI capabilities.
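For orientation, here is a minimal inference sketch using the standard Hugging Face transformers API. The repo id matches this card; the prompt and generation settings are illustrative assumptions, so check the hub model card for the canonical usage.

```python
# Minimal inference sketch (transformers); prompt and settings are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "opencsg/csg-wukong-1B-sft-bf16"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # weights ship in BF16 per the card
    device_map="auto",
)

prompt = "Explain what a small language model is in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# The card lists a 2k context window; keep prompt + generation within it.
output = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```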

Key Characteristics & Performance

  • Model Size: 1.1 billion parameters, making it suitable for resource-constrained environments or applications requiring faster inference.
  • Base Model: Fine-tuned from the pre-trained csg-wukong-1B.
  • Training Details: The model was trained for 43 days on 16 H800 GPUs, using DeepSpeed for orchestration and PyTorch for the neural network implementation, with BF16 precision enabled via Apex (a representative configuration sketch follows this list).
  • Leaderboard Ranking: The csg-wukong-1B base model reached a notable 8th place among pretrained small language models of roughly 1.5B parameters on the open_llm_leaderboard, indicating strong performance within its size class.
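The card states that DeepSpeed was used with BF16 precision but does not publish the training configuration. Purely as an illustration of what such a setup commonly looks like, here is a hypothetical DeepSpeed configuration sketch; every value below is an assumption, not OpenCSG's actual setting.

```python
# Hypothetical DeepSpeed config illustrating BF16 training; values are placeholders.
ds_config = {
    "bf16": {"enabled": True},            # native BF16 mixed precision
    "zero_optimization": {"stage": 2},    # ZeRO stage is an assumption
    "train_micro_batch_size_per_gpu": 8,  # placeholder batch size
    "gradient_accumulation_steps": 4,     # placeholder accumulation
}

# A dict like this can be passed to deepspeed.initialize(config=ds_config, ...)
# or to the Hugging Face Trainer via TrainingArguments(deepspeed=ds_config).
```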

Intended Use Cases

This model is well-suited for applications where a balance between performance and computational efficiency is crucial. Its competitive ranking suggests it can be a strong candidate for:

  • General text generation and understanding tasks.
  • Deployment on edge devices or in scenarios with limited hardware resources.
  • Serving as a foundation for further domain-specific fine-tuning (a minimal sketch follows this list).
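To illustrate the last point, the following is a minimal, hypothetical fine-tuning sketch using the Hugging Face Trainer; the dataset file, hyperparameters, and output directory are placeholders, not a recipe from OpenCSG.

```python
# Hypothetical domain fine-tuning sketch; data and hyperparameters are placeholders.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_id = "opencsg/csg-wukong-1B-sft-bf16"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # causal LMs often lack a pad token

# "domain_corpus.txt" is a placeholder for your own domain data.
dataset = load_dataset("text", data_files={"train": "domain_corpus.txt"})["train"]

def tokenize(batch):
    # Stay within the model's 2k context window.
    return tokenizer(batch["text"], truncation=True, max_length=2048)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="wukong-domain-sft",   # placeholder path
        per_device_train_batch_size=4,    # placeholder hyperparameters
        num_train_epochs=1,
        bf16=True,                        # keep training in BF16 like the base run
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```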