Vivian12300/mmlu_same_f_llama2

Task: Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quantization: FP8 · Context Length: 4k · Published: Sep 18, 2024 · License: llama2 · Architecture: Transformer · Open Weights

Vivian12300/mmlu_same_f_llama2 is a 7-billion-parameter language model fine-tuned from Meta's Llama-2-7b-chat-hf. The model is adapted from its base checkpoint for specialized tasks, though its primary differentiators and intended uses are not documented. Its training centered on a specific generator dataset, which points to focused applications rather than broad general-purpose use.


Overview

This model, mmlu_same_f_llama2, is a fine-tuned variant of Meta's Llama-2-7b-chat-hf. It has 7 billion parameters and was published by Vivian12300. The fine-tuning process involved a specific "generator dataset," suggesting an optimization for particular data generation or transformation tasks.
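
Since the card does not document usage, the sketch below shows one plausible way to load and query the checkpoint with Hugging Face transformers, assuming it is published on the Hub under the ID above and follows the standard Llama-2 layout; the prompt is purely illustrative.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "Vivian12300/mmlu_same_f_llama2"

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # device_map="auto" requires the accelerate package; it places weights
    # on the available GPU(s) or falls back to CPU.
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,
        device_map="auto",
    )

    # Illustrative prompt; the card does not specify an intended prompt format.
    prompt = "Explain the difference between precision and recall."
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=128)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))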

Training Details

The model was trained using the following key hyperparameters (a configuration sketch follows the list):

  • Learning Rate: 5e-05
  • Batch Size: 1 (train), 2 (eval)
  • Gradient Accumulation Steps: 16 (resulting in a total effective batch size of 16)
  • Optimizer: Adam with standard betas and epsilon
  • LR Scheduler: Linear
  • Epochs: 30
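
For reference, these hyperparameters map onto Hugging Face TrainingArguments roughly as follows. This is a reconstruction sketch, not the author's actual training script: output_dir is a placeholder, and the Adam betas/epsilon are assumed to be the standard defaults the card alludes to.

    from transformers import TrainingArguments

    training_args = TrainingArguments(
        output_dir="mmlu_same_f_llama2",   # placeholder path
        learning_rate=5e-05,
        per_device_train_batch_size=1,
        per_device_eval_batch_size=2,
        gradient_accumulation_steps=16,    # effective train batch size: 1 * 16 = 16
        num_train_epochs=30,
        lr_scheduler_type="linear",
        # "Standard betas and epsilon" assumed to mean the usual Adam defaults:
        adam_beta1=0.9,
        adam_beta2=0.999,
        adam_epsilon=1e-08,
    )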

Key Characteristics

  • Base Model: Llama-2-7b-chat-hf
  • Parameter Count: 7 billion
  • Fine-tuning Focus: Generator dataset, implying specialized output generation.

Limitations

The accompanying documentation does not specify intended uses, limitations, or details of the training and evaluation data. Users should exercise caution and test the model thoroughly before relying on it for any specific application.