ikura31/mistral_docs_sum_p1_full

Text Generation · Model Size: 7B · Quant: FP8 · Context Length: 4k · Published: May 8, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights

The ikura31/mistral_docs_sum_p1_full model is a 7-billion-parameter language model fine-tuned from mistralai/Mistral-7B-Instruct-v0.1. It was trained for one epoch with a learning rate of 3.6e-05 and achieved a final validation loss of 0.5829. Its name suggests it is intended for document summarization, though the model card does not document its training dataset or intended uses.


Model Overview

ikura31/mistral_docs_sum_p1_full is a 7-billion-parameter language model fine-tuned from the mistralai/Mistral-7B-Instruct-v0.1 base model. Fine-tuning ran for a single epoch with a learning rate of 3.6e-05, using the Adam optimizer and Native AMP mixed-precision training. The model achieved a final validation loss of 0.5829.
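
Since the model card does not include usage instructions, the following is a minimal inference sketch. It assumes the weights are available on the Hugging Face Hub under the ID shown, that the model retains the chat template of its Mistral-7B-Instruct-v0.1 base, and that the 7B weights fit on a single GPU in half precision; the prompt wording is illustrative.

```python
# Minimal sketch: load the fine-tuned model and ask it to summarize a document.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ikura31/mistral_docs_sum_p1_full"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: half precision is sufficient for the 7B weights
    device_map="auto",
)

document = "..."  # the text to summarize
messages = [
    {"role": "user", "content": f"Summarize the following document:\n\n{document}"}
]

# Assumes the base model's instruct chat template was preserved during fine-tuning.
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=False)

# Decode only the newly generated tokens (the summary).
summary = tokenizer.decode(output_ids[0, input_ids.shape[-1]:], skip_special_tokens=True)
print(summary)
```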

Training Details

  • Base Model: Mistral-7B-Instruct-v0.1
  • Parameters: 7 Billion
  • Epochs: 1
  • Learning Rate: 3.6e-05
  • Optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
  • Batch Size: 4 (train and eval)
  • Final Validation Loss: 0.5829
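
The hyperparameters listed above can be expressed as Hugging Face `TrainingArguments`; the sketch below is one plausible reconstruction, not the author's actual training script. Only the listed values are taken from the model card; the output directory name is illustrative, and the dataset, prompt formatting, and any parameter-efficient setup (e.g. LoRA) are not documented.

```python
# Sketch of the reported fine-tuning configuration (values mirror the list above).
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="mistral_docs_sum_p1_full",  # illustrative name, not from the model card
    num_train_epochs=1,                     # Epochs: 1
    learning_rate=3.6e-05,                  # Learning Rate: 3.6e-05
    per_device_train_batch_size=4,          # Batch Size: 4 (train)
    per_device_eval_batch_size=4,           # Batch Size: 4 (eval)
    adam_beta1=0.9,                         # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-08,                     # epsilon=1e-08
    fp16=True,                              # Native AMP mixed-precision training
)
```

These arguments would be passed to a `Trainer` together with the fine-tuning dataset, which the model card does not identify.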

Limitations

Specific details regarding the dataset used for fine-tuning, intended use cases, and known limitations are not provided in the model card. Users should exercise caution and conduct further evaluation to determine suitability for specific applications, particularly for document summarization tasks implied by the model's name.