marcchew/Platyporoni-7B

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Context Length: 4k · License: cc-by-nc-4.0 · Architecture: Transformer · Open Weights

Platyporoni-7B is a language model developed by marcchew, fine-tuned from the AIDC-ai-business/Marcoroni-7B base model. It was trained for one epoch with a learning rate of 8e-06 and a train batch size of 48, reaching a final validation loss of 2.7324. Its primary use cases are not documented, but the fine-tuning setup suggests specialization for tasks aligned with the base model's capabilities.
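
If the weights are published on the Hugging Face Hub under the marcchew/Platyporoni-7B repo id and follow the standard causal-LM layout of the Marcoroni-7B base, the model could be loaded for text generation roughly as sketched below. This is an assumption-based sketch, not documented usage, and the prompt is a placeholder.

from transformers import AutoModelForCausalLM, AutoTokenizer

# Sketch only: assumes the checkpoint is hosted on the Hugging Face Hub under
# this repo id and loads like its Marcoroni-7B base. device_map="auto"
# requires the accelerate package.
model_id = "marcchew/Platyporoni-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Summarize the main idea of transfer learning in two sentences."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# The listed context length is 4k tokens; keep prompt plus output within that.
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))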

Platyporoni-7B Overview

Platyporoni-7B is a fine-tuned language model created by marcchew, building on the AIDC-ai-business/Marcoroni-7B base model. The dataset used for fine-tuning is not disclosed. The model was trained for one epoch with a learning rate of 8e-06, a train_batch_size of 48, and gradient_accumulation_steps of 2, giving a total_train_batch_size of 96. The optimizer was Adam with betas=(0.9, 0.999) and epsilon=1e-08.

Training Performance

During its single training epoch, Platyporoni-7B reached a final validation loss of 2.7324. Training ran for 256 steps, with validation loss decreasing from an initial 2.8691 to that final value. Key hyperparameters, mirrored in the configuration sketch after this list, included:

  • Learning Rate: 8e-06
  • Optimizer: Adam (betas=(0.9, 0.999), epsilon=1e-08)
  • Epochs: 1
  • Batch Size: 48 (train), 6 (eval)
  • Gradient Accumulation Steps: 2 (total train batch size: 96)
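
For reference, these settings map directly onto the Hugging Face transformers TrainingArguments. The sketch below assumes single-device training (so the reported train_batch_size of 48 is the per-device value) and uses a hypothetical output directory; the actual training script is not documented.

from transformers import TrainingArguments

# Configuration sketch mirroring the reported hyperparameters. output_dir is
# a hypothetical placeholder, and dataset wiring is omitted because the
# fine-tuning dataset is not disclosed.
training_args = TrainingArguments(
    output_dir="platyporoni-7b-finetune",  # hypothetical
    num_train_epochs=1,
    learning_rate=8e-06,
    per_device_train_batch_size=48,   # reported train_batch_size
    per_device_eval_batch_size=6,     # reported eval batch size
    gradient_accumulation_steps=2,    # 48 * 2 = total_train_batch_size of 96
    adam_beta1=0.9,                   # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-08,
)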

Limitations

The model card does not document intended uses, specific capabilities, or known limitations, and the training data and any task-specific optimizations remain unspecified. Users should exercise caution and run their own evaluation, such as the loss check sketched below, before relying on the model for a particular application.
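
As a minimal starting point, one could measure the model's cross-entropy loss on a held-out sample and compare it against the reported validation loss of 2.7324. The sketch below again assumes the marcchew/Platyporoni-7B repo id resolves on the Hugging Face Hub and uses placeholder evaluation text; the actual validation set is not disclosed, so numbers will differ.

import math

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Sanity-check sketch: compute language-modeling loss on your own text.
# The reported validation loss (2.7324) came from an undisclosed dataset,
# so this only gives a rough point of comparison.
model_id = "marcchew/Platyporoni-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
model.eval()

text = "Your own held-out evaluation text goes here."
enc = tokenizer(text, return_tensors="pt").to(model.device)
with torch.no_grad():
    # Passing input_ids as labels yields the shifted causal-LM loss.
    out = model(**enc, labels=enc["input_ids"])
print(f"loss: {out.loss.item():.4f}  perplexity: {math.exp(out.loss.item()):.2f}")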