Xinging/aigc_statement_1m_z3_bs8_pt
Xinging/aigc_statement_1m_z3_bs8_pt is an 8-billion-parameter language model fine-tuned from Meta's Llama 3.1. It was trained on the aigc_statement_1m_pre-training dataset, suggesting a specialization in tasks involving AIGC (AI-Generated Content) statements, and supports a 32K-token context window.
Model Overview
Xinging/aigc_statement_1m_z3_bs8_pt is an 8-billion-parameter language model fine-tuned from the meta-llama/Llama-3.1-8B base model. The fine-tuning used the aigc_statement_1m_pre-training dataset, indicating a specialized focus on tasks related to AI-Generated Content (AIGC) statements. The model operates with a context length of 32,768 tokens.
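Since the model card gives no usage snippet, here is a minimal sketch of loading the checkpoint as an ordinary Llama-based causal LM for plain text continuation (appropriate for a pre-training-style fine-tune). The repo ID is taken from the card; the bf16 dtype and `device_map="auto"` placement are assumptions, not stated in the card. Imports are deferred into the function so the sketch can be read without `transformers` installed:

```python
def generate_continuation(prompt: str, max_new_tokens: int = 128) -> str:
    """Load Xinging/aigc_statement_1m_z3_bs8_pt and continue `prompt`.

    Sketch only: dtype and device placement are assumptions, not
    confirmed by the model card.
    """
    # Deferred imports: keeps this file importable without the heavy deps.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "Xinging/aigc_statement_1m_z3_bs8_pt"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # assumption: bf16 weights
        device_map="auto",           # assumption: let accelerate place layers
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Keep only the newly generated tokens, not the echoed prompt.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Note that since the suffix `_pt` suggests continued pre-training rather than instruction tuning, plain completion prompts are likely more reliable than a chat template here.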
Training Details
The model was trained using the following key hyperparameters:
- Learning Rate: 2e-05
- Batch Size: 8 per device (train and eval), for an effective batch size of 32 across 4 GPUs
- Optimizer: AdamW with betas=(0.9, 0.999) and epsilon=1e-08
- LR Scheduler: cosine, with a warmup ratio of 0.01
- Epochs: 1.0
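The schedule above can be reproduced in plain Python. The peak LR of 2e-05 and the 0.01 warmup ratio come from the card; the exact shape (linear warmup from 0, then cosine decay to 0) mirrors the common linear-warmup-plus-cosine formulation used by standard trainers, which is an assumption since the card only names the scheduler type:

```python
import math

PEAK_LR = 2e-05       # from the model card
WARMUP_RATIO = 0.01   # from the model card

def lr_at(step: int, total_steps: int) -> float:
    """Linear warmup for the first 1% of steps, then cosine decay to 0."""
    warmup_steps = max(1, int(total_steps * WARMUP_RATIO))
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps  # linear ramp from 0 to peak
    # Fraction of the decay phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))

# Effective batch size stated in the card: 8 per device across 4 GPUs.
EFFECTIVE_BATCH = 8 * 4  # = 32
```

For example, with 10,000 total steps the LR ramps to 2e-05 by step 100, falls to 1e-05 roughly midway through the decay phase, and reaches 0 at the final step.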
Potential Use Cases
Given its fine-tuning on an AIGC-specific dataset, this model is likely best suited for applications involving:
- Generating or analyzing statements related to AI-generated content.
- Tasks requiring understanding or creation of text within the AIGC domain.
Further details on specific intended uses, limitations, and comprehensive training/evaluation data are not provided in the current model card.