kei0902/fine-tuned-gemma

Text generation · Model size: 2.6B · Quantization: BF16 · Context length: 8k · License: gemma · Architecture: Transformer

kei0902/fine-tuned-gemma is a 2.6 billion parameter language model fine-tuned from Google's gemma-2-2b-jpn-it. It was trained for 3 epochs at a learning rate of 2e-05 with mixed-precision training. Because the fine-tuning dataset is not documented, the model's specific differentiators and primary use cases are unknown.


Model Overview

This model, kei0902/fine-tuned-gemma, is a 2.6 billion parameter language model fine-tuned from Google's gemma-2-2b-jpn-it, a Japanese instruction-tuned variant of Gemma 2 2B. The dataset used for fine-tuning is not documented.
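
The card ships no usage instructions, so the snippet below is only a minimal inference sketch. It assumes the repository keeps the standard Gemma 2 chat setup of its base model; the Japanese prompt is illustrative.

```python
# Minimal inference sketch using Hugging Face transformers.
# Assumes the repo follows the standard Gemma 2 setup of its base model
# (gemma-2-2b-jpn-it); the prompt below is illustrative only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "kei0902/fine-tuned-gemma"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the card lists BF16 weights
    device_map="auto",
)

# Gemma 2 instruction-tuned models use a chat template.
messages = [{"role": "user", "content": "自己紹介をしてください。"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```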

Training Details

The fine-tuning procedure used the following hyperparameters; a configuration sketch follows the list.

  • Learning Rate: 2e-05
  • Batch Size: train_batch_size of 1 and eval_batch_size of 8; with gradient_accumulation_steps of 8, the effective total_train_batch_size is 8 (1 × 8)
  • Optimizer: AdamW with default betas and epsilon
  • LR Scheduler: linear
  • Epochs: 3
  • Mixed Precision: native AMP
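
For concreteness, the reported hyperparameters map onto a Hugging Face transformers TrainingArguments configuration as sketched below. This is a reconstruction, not the author's script: the output directory, the optimizer name, and the choice of bf16 for the native-AMP setting are assumptions, and the dataset and Trainer wiring are omitted because the card does not specify them.

```python
# Sketch of TrainingArguments matching the hyperparameters reported above.
# Dataset, model/tokenizer setup, and Trainer wiring are omitted because
# the card does not document them; names here are illustrative.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="fine-tuned-gemma",     # illustrative path, not from the card
    learning_rate=2e-05,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=8,     # effective train batch size: 1 x 8 = 8
    num_train_epochs=3,
    lr_scheduler_type="linear",
    optim="adamw_torch",               # AdamW with default betas/epsilon
    bf16=True,                         # assumed native-AMP dtype, matching the BF16 weights
)
```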

Limitations

Because the fine-tuning dataset and training objectives are undocumented, the intended uses and limitations of this model are not clearly defined. Independent evaluation of its performance characteristics would be required before choosing it for any application.