whynlp/tinyllama-zh
whynlp/tinyllama-zh is a Llama-architecture language model pretrained on approximately 45 billion Chinese tokens from the WuDaoCorpora Text dataset. Developed by whynlp, this model serves as a demonstration for pretraining TinyLlama on large Chinese corpora. It utilizes the THUDM/chatglm3-6b tokenizer and is primarily intended for research and educational purposes to showcase the pretraining process rather than achieving state-of-the-art performance.
Overview
whynlp/tinyllama-zh is a Llama-architecture model developed by whynlp, specifically pretrained on Chinese corpora. This project primarily functions as a demonstration of how to pretrain a TinyLlama model using the Hugging Face transformers library on a large dataset. It is fine-tuned from a TinyLlama-2.5T checkpoint.
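Since the model uses the standard Llama architecture, it should load through the usual transformers API. The following is an untested sketch, not an official quick-start from the authors; `trust_remote_code=True` is assumed to be needed because the chatglm3-6b tokenizer ships custom tokenizer code, and the prompt string is an arbitrary example.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# The tokenizer comes from THUDM/chatglm3-6b, which bundles custom code
tokenizer = AutoTokenizer.from_pretrained(
    "THUDM/chatglm3-6b", trust_remote_code=True
)
model = AutoModelForCausalLM.from_pretrained("whynlp/tinyllama-zh")

# Arbitrary Chinese prompt ("The weather today") for demonstration
inputs = tokenizer("今天天气", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```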
Key Training Details
- Dataset: WuDaoCorpora Text, comprising approximately 45 billion tokens.
- Training Epochs: 2 epochs.
- Training Duration: Approximately 6 days using 8 A100 GPUs.
- Tokenizer: Employs the THUDM/chatglm3-6b tokenizer.
- License: MIT.
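The figures above imply a per-GPU throughput that is easy to sanity-check. This back-of-envelope calculation is not from the model card itself; it only combines the numbers listed above (45B tokens, 2 epochs, 6 days, 8 A100s):

```python
# Rough throughput implied by the stated training details
total_tokens = 45e9 * 2        # ~45B tokens seen twice (2 epochs)
wall_seconds = 6 * 24 * 3600   # ~6 days of wall-clock time
num_gpus = 8                   # 8 A100 GPUs

per_gpu_tps = total_tokens / wall_seconds / num_gpus
print(round(per_gpu_tps))      # → 21701, i.e. ~22k tokens/s per A100
```

That figure is in the same ballpark as the ~24k tokens/s per A100 reported by the original TinyLlama project, so the stated duration is plausible.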
Intended Use and Limitations
This model is designed to illustrate the pretraining process on a large corpus. While functional, its performance is weak: its CMMLU score is only slightly above 25, barely better than the 25% random-guess baseline for CMMLU's four-choice questions. For stronger results in practical applications, the developers suggest using a higher-quality corpus such as Wanjuan. The model is therefore best suited to researchers and developers who want to study the mechanics of pretraining TinyLlama on Chinese data, rather than to high-performance production use cases.