tomdeore/nonymus-llm

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Architecture: Transformer

tomdeore/nonymus-llm is a fine-tuned language model based on the Llama 2 7B architecture, specifically the abhishek/llama-2-7b-hf-small-shards variant. It was trained for 3 epochs with a learning rate of 0.0002 using the Adam optimizer. Further details on its capabilities, training dataset, and intended uses are currently unspecified.


Overview

tomdeore/nonymus-llm is a fine-tuned language model derived from the Llama 2 7B architecture, with abhishek/llama-2-7b-hf-small-shards as the base model. The dataset used for fine-tuning is not documented, but the training process used standard hyperparameters, listed below.
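
The card does not include usage instructions, so the following is a minimal, assumed sketch of loading the model with Hugging Face Transformers. The repo id comes from the card; the dtype, device placement, and prompt are illustrative assumptions, and the sketch presumes the repository ships standard Llama 2 weight and tokenizer files.

```python
# Assumed usage sketch: load tomdeore/nonymus-llm with Transformers and generate text.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tomdeore/nonymus-llm"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # assumption: half precision for a 7B model on a single GPU
    device_map="auto",          # requires the `accelerate` package
)

prompt = "Explain what a fine-tuned language model is."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```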

Training Details

The model was trained using the following key hyperparameters:

  • Learning Rate: 0.0002
  • Batch Size: 8 (for both training and evaluation)
  • Optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
  • Epochs: 3.0
  • Frameworks: Transformers 4.33.2, PyTorch 2.0.1, Tokenizers 0.13.3

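For reference, the sketch below shows how these reported values would map onto transformers.TrainingArguments. The original training script and dataset are not published, so this is only a hedged reconstruction of the configuration; the output directory name is a hypothetical placeholder.

```python
from transformers import TrainingArguments

# Reported hyperparameters expressed as TrainingArguments
# (framework versions from the card: Transformers 4.33.2, PyTorch 2.0.1).
args = TrainingArguments(
    output_dir="nonymus-llm",        # hypothetical output directory
    learning_rate=2e-4,              # 0.0002
    per_device_train_batch_size=8,   # batch size 8 for training
    per_device_eval_batch_size=8,    # and for evaluation
    num_train_epochs=3.0,
    adam_beta1=0.9,                  # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,               # epsilon=1e-08
)
```
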
Current Status and Limitations

Detailed information about the model's specific capabilities, intended uses, and the characteristics of its training and evaluation data is not currently available. Without this information, its performance and suitability for particular tasks cannot be assessed, and further details are needed to identify its differentiators or optimal applications.