CharlesLi/llama_2_sky_safe_o1_4o_default_1000_500_full

Task: Text Generation · Model Size: 7B · Quantization: FP8 · Context Length: 4k · Published: Jan 13, 2025 · License: llama2 · Architecture: Transformer (open weights)

CharlesLi/llama_2_sky_safe_o1_4o_default_1000_500_full is a 7-billion-parameter causal language model fine-tuned from Meta's Llama-2-7b-chat-hf. It was trained on a generator dataset and achieves a loss of 0.7041 on its evaluation set. It is intended for general language generation, leveraging the Llama 2 architecture for conversational and text-based applications.


Model Overview

CharlesLi/llama_2_sky_safe_o1_4o_default_1000_500_full is a 7-billion-parameter language model fine-tuned from the meta-llama/Llama-2-7b-chat-hf base model. It was trained on a "generator dataset" and achieved an evaluation loss of 0.7041.
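As a standard Llama 2 checkpoint, the model can presumably be loaded with the Hugging Face `transformers` library. A minimal sketch, assuming the checkpoint is published on the Hub under the repository ID above (generation parameters here are illustrative, not from the model card):

```python
# Sketch: loading the fine-tuned checkpoint with Hugging Face transformers.
# The repository ID comes from the model card; availability is assumed.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "CharlesLi/llama_2_sky_safe_o1_4o_default_1000_500_full"

def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Load the model and generate a completion for `prompt`."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

For a 7B model in FP8 or half precision, a single GPU with roughly 8-16 GB of memory is typically sufficient for inference.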

Training Details

The model was trained using the following key hyperparameters:

  • Learning Rate: 2e-05
  • Batch Size: 32 (total train batch size, with 4 per device and 2 gradient accumulation steps)
  • Optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
  • Scheduler: Cosine learning rate scheduler with a 0.1 warmup ratio
  • Epochs: 1
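The batch-size figures above imply the device count used for training. Under the standard effective-batch-size formula, a quick check:

```python
# total_batch = per_device_batch * gradient_accumulation_steps * num_devices
# Values taken from the hyperparameters listed above.
per_device_batch = 4
gradient_accumulation_steps = 2
total_batch = 32

num_devices = total_batch // (per_device_batch * gradient_accumulation_steps)
print(num_devices)  # → 4, i.e. training presumably ran on 4 devices
```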

Intended Use Cases

The model card does not detail specific intended uses, but as a fine-tuned Llama-2-7b-chat-hf variant the model is generally suitable for:

  • Text Generation: Creating coherent and contextually relevant text.
  • Conversational AI: Engaging in dialogue and responding to prompts.
  • General Language Tasks: Applications requiring understanding and generation of human-like text.
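Because the base model is Llama-2-7b-chat-hf, its single-turn chat prompt template (with `[INST]` and `<<SYS>>` markers) presumably still applies after fine-tuning. A minimal formatter sketching that format (the system and user strings below are illustrative):

```python
def format_llama2_chat(system: str, user: str) -> str:
    """Build a single-turn prompt in the Llama 2 chat format."""
    return (
        "<s>[INST] <<SYS>>\n"
        f"{system}\n"
        "<</SYS>>\n\n"
        f"{user} [/INST]"
    )

prompt = format_llama2_chat(
    "You are a helpful assistant.",
    "Summarize Llama 2 in one sentence.",
)
print(prompt)
```

The model's reply is everything generated after the closing `[/INST]` tag.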

Further information regarding specific applications, limitations, and training data is not provided in the model card.