Name: tzchen07/SG_X9e API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: tzchen07

Model Overview

tzchen07/SG_X9e is a 2.6-billion-parameter language model built on jxm/shieldgemma-2b. It was fine-tuned on five datasets: v1.6, v1.6b, v1.6c, v1.6d, and v1.6e. The original documentation does not say which applications the fine-tuning targets or how it changes the model's capabilities.

Training Details

The model was trained with a learning rate of 5e-06, a batch size of 4, and 16 gradient accumulation steps, giving an effective total batch size of 64 (4 × 16). Training used the AdamW_Torch_Fused optimizer for 2 epochs, with a cosine learning rate scheduler and a warmup ratio of 0.1. The training environment included Transformers 4.57.1, PyTorch 2.10.0+cu129, Datasets 4.0.0, and Tokenizers 0.22.2.
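
The batch-size arithmetic and the shape of the learning rate schedule can be sketched in plain Python. This is only an illustration under stated assumptions, not the author's training script: the function names are hypothetical, a single training device is assumed (which is consistent with the reported total of 64), and the total step count of 1000 is a placeholder because the dataset sizes are not documented. The schedule follows the common linear-warmup-then-cosine-decay form.

```python
import math

# Values taken from the training details above.
LEARNING_RATE = 5e-6
BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 16
WARMUP_RATIO = 0.1

# Assumption: one device, which matches the reported effective batch size of 64.
NUM_DEVICES = 1


def effective_batch_size(batch_size, accum_steps, num_devices=1):
    """Number of samples contributing to each optimizer update."""
    return batch_size * accum_steps * num_devices


def cosine_lr_with_warmup(step, total_steps,
                          base_lr=LEARNING_RATE, warmup_ratio=WARMUP_RATIO):
    """Linear warmup to base_lr, then cosine decay towards zero (hypothetical helper)."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print(effective_batch_size(BATCH_SIZE, GRAD_ACCUM_STEPS, NUM_DEVICES))
    total_steps = 1000  # placeholder; the real step count is not documented
    for step in (0, 50, 100, 550, 1000):
        print(step, cosine_lr_with_warmup(step, total_steps))
```

With these settings the learning rate climbs linearly during the first 10% of steps, peaks at 5e-06, and then decays along a half cosine until the end of training.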

Intended Use

Specific intended uses and limitations are not documented. The model's ShieldGemma base and its fine-tuning on multiple datasets suggest potential for specialized language understanding and generation tasks. Developers should check that its 2.6B parameter size and 8192-token context length suit their application.
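
One practical consequence of the 8192-token context length is that a request has a fixed token budget. The sketch below shows one way to check that budget before sending a request. It assumes the limit covers prompt and generated tokens together, as is usual for decoder-only models, and the helper names are hypothetical. Token counts would come from the model's own tokenizer, which is not loaded here.

```python
# Context length stated in the model description.
CONTEXT_LENGTH = 8192


def max_new_tokens_allowed(prompt_tokens, context_length=CONTEXT_LENGTH):
    """Tokens left for generation once the prompt is accounted for (hypothetical helper)."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    # A prompt longer than the window leaves no room to generate.
    return max(0, context_length - prompt_tokens)


def fits_in_context(prompt_tokens, max_new_tokens, context_length=CONTEXT_LENGTH):
    """True if prompt plus requested generation stays within the context window."""
    return prompt_tokens + max_new_tokens <= context_length


if __name__ == "__main__":
    print(max_new_tokens_allowed(1000))
    print(fits_in_context(8000, 192))
    print(fits_in_context(8000, 193))
```

A caller could use `max_new_tokens_allowed` to cap the generation length, or truncate the prompt when `fits_in_context` returns `False`.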
