opencsg/csg-wukong-1B

Hugging Face · Text Generation · Model size: 1.1B · Quantization: BF16 · Context length: 2k · Published: Apr 11, 2024 · License: apache-2.0 · Architecture: Transformer · Open weights

The csg-wukong-1B is a 1.1 billion-parameter small language model (SLM) developed by OpenCSG. Pretrained on 1 trillion tokens, the model is designed for efficient language tasks. It ranks 8th among pretrained small language models in the ~1.5B parameter class on the open_llm_leaderboard, indicating strong performance within its size class, and it is suitable for applications that require a compact yet capable language model.


Model Overview

The csg-wukong-1B is a 1.1 billion-parameter Small Language Model (SLM) developed by OpenCSG. OpenCSG's vision is to democratize generative large models, making them accessible for every industry, company, and individual through open-source principles.
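As a quick start, the model can be loaded with the Hugging Face transformers library. The snippet below is a minimal sketch, assuming the weights are hosted on the Hub under opencsg/csg-wukong-1B and expose the standard causal-LM interface; the prompt and generation settings are illustrative, not recommended values from OpenCSG.

```python
# Minimal inference sketch for csg-wukong-1B (settings are illustrative).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "opencsg/csg-wukong-1B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Load in BF16, matching the published precision of the weights.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

inputs = tokenizer("The quick brown fox", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```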

Key Capabilities & Performance

  • Compact and Efficient: With 1.1 billion parameters, it balances size against performance, making it suitable for resource-constrained environments (see the footprint estimate after this list).
  • Extensive Pretraining: The model was pretrained on a substantial dataset of 1 trillion tokens, contributing to its language understanding and generation capabilities.
  • Competitive Ranking: It has demonstrated strong performance on the open_llm_leaderboard, ranking 8th among pretrained SLMs in the ~1.5B parameter class.
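As a rough footprint estimate (an illustration, not a figure published by OpenCSG): 1.1 billion parameters stored in BF16 occupy about 1.1e9 × 2 bytes ≈ 2.2 GB for the weights alone, before activations and KV cache, which is why the model fits comfortably on a single consumer GPU.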

Training Details

The csg-wukong-1B was trained over 43 days on 16 NVIDIA H800 GPUs. The pipeline used DeepSpeed for distributed training orchestration, PyTorch as the deep learning framework, and Apex for BF16 mixed-precision support.
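OpenCSG has not published the actual training configuration; only the use of DeepSpeed, PyTorch, and BF16 comes from the model card. The sketch below shows what a minimal DeepSpeed setup with BF16 enabled might look like, with the batch size, accumulation steps, learning rate, and ZeRO stage all marked as assumptions.

```python
# Illustrative DeepSpeed setup only; values marked "assumed" are not from
# the model card. Only DeepSpeed + PyTorch + BF16 are documented.
import deepspeed
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("opencsg/csg-wukong-1B")

ds_config = {
    "train_micro_batch_size_per_gpu": 8,                       # assumed value
    "gradient_accumulation_steps": 8,                          # assumed value
    "bf16": {"enabled": True},                                 # BF16, as stated
    "optimizer": {"type": "AdamW", "params": {"lr": 3e-4}},    # assumed optimizer/lr
    "zero_optimization": {"stage": 2},                         # assumed ZeRO stage
}

# Wrap the model in a DeepSpeed engine for distributed BF16 training.
engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)
```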

Use Cases

This model is well-suited for applications where a small footprint is critical but effective language processing is still required, such as cost-sensitive or resource-limited deployments. Its competitive ranking among SLMs suggests solid performance across common downstream language tasks.