Xinging/aigc_statement_1m_z3_bs8_pt

Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 32K · Published: Apr 19, 2025 · License: other · Architecture: Transformer

Xinging/aigc_statement_1m_z3_bs8_pt is an 8-billion-parameter language model fine-tuned from Meta Llama 3.1. It was trained on the aigc_statement_1m_pre-training dataset, which suggests a specialization in tasks involving AIGC (AI-Generated Content) statements. The model supports a 32K-token context window and targets applications within the AIGC domain.


Model Overview

Xinging/aigc_statement_1m_z3_bs8_pt is an 8-billion-parameter language model fine-tuned from the meta-llama/Llama-3.1-8B base model. Fine-tuning used the aigc_statement_1m_pre-training dataset, indicating a specialized focus on tasks related to AI-Generated Content (AIGC) statements. The model operates with a context length of 32,768 tokens.

Training Details

The model was trained with the following key hyperparameters (mirrored in the configuration sketch after this list):

  • Learning Rate: 2e-05
  • Batch Size: 8 per device (train and eval), for an effective batch size of 32 across 4 GPUs.
  • Optimizer: AdamW with betas=(0.9, 0.999) and epsilon=1e-08.
  • LR Scheduler: cosine, with a warmup ratio of 0.01.
  • Epochs: 1.0
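
The card does not say which framework produced these settings, but they map directly onto the Hugging Face Trainer API. A minimal sketch, assuming transformers' TrainingArguments was used; the output directory name and the gradient-accumulation value of 1 are assumptions, not stated in the card:

```python
from transformers import TrainingArguments

# Hypothetical configuration mirroring the reported hyperparameters;
# the actual training framework is not confirmed by the model card.
training_args = TrainingArguments(
    output_dir="aigc_statement_1m_z3_bs8_pt",  # assumed name
    learning_rate=2e-5,
    per_device_train_batch_size=8,   # 8 per GPU x 4 GPUs = 32 effective
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=1,   # assumed; 8 x 4 already yields 32
    adam_beta1=0.9,                  # AdamW betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.01,
    num_train_epochs=1.0,
)
```

The "z3" and "bs8" fragments in the model name plausibly refer to DeepSpeed ZeRO stage 3 and the batch size of 8, though the card does not confirm this.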

Potential Use Cases

Given its fine-tuning on an AIGC-specific dataset, this model is likely best suited for applications involving the following (see the loading sketch after the list):

  • Generating or analyzing statements related to AI-generated content.
  • Tasks requiring understanding or creation of text within the AIGC domain.
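
As a Llama 3.1 fine-tune, the checkpoint should load with the standard transformers API. A minimal sketch, assuming the repository follows the usual Llama 3.1 layout; the prompt is a hypothetical example of an AIGC-statement task, not taken from the card:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Xinging/aigc_statement_1m_z3_bs8_pt"

# Assumes a standard Llama 3.1 checkpoint layout loadable via AutoModel classes.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# Hypothetical prompt illustrating the AIGC-statement focus.
prompt = "Write a disclosure statement for an article drafted with AI assistance."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```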

Further details on specific intended uses, limitations, and comprehensive training/evaluation data are not provided in the current model card.