CharlesLi/llama_2_sky_safe_o1_llama_3_70B_reflect_1000_100_full

Text generation · Concurrency cost: 1 · Model size: 7B · Quantization: FP8 · Context length: 4k · Published: Jan 13, 2025 · License: llama2 · Architecture: Transformer · Open weights

CharlesLi/llama_2_sky_safe_o1_llama_3_70B_reflect_1000_100_full is a 7-billion-parameter language model fine-tuned from Meta's Llama-2-7b-chat-hf. It was trained on the "generator" dataset and reached a loss of 0.7303 on its evaluation set. Built on the Llama 2 architecture, it is intended for general language-generation tasks; the model card provides no further detail on its intended uses or limitations.


Model Overview

This model, llama_2_sky_safe_o1_llama_3_70B_reflect_1000_100_full, is a fine-tuned variant of the Meta Llama-2-7b-chat-hf base model. It features 7 billion parameters and was developed by CharlesLi.

Key Training Details

  • Base Model: meta-llama/Llama-2-7b-chat-hf
  • Fine-tuning Dataset: Generator dataset
  • Evaluation Loss: 0.7303
  • Hyperparameters:
    • Learning Rate: 2e-05
    • Optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
    • Epochs: 1
    • Total Train Batch Size: 32 (across 4 GPUs with 2 gradient accumulation steps)
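The total train batch size above follows from the per-device batch size, the GPU count, and the gradient-accumulation steps. As a sketch of that arithmetic (the per-device value of 4 is inferred from the reported total, not stated in the model card):

```python
# Effective (total) train batch size =
#   per-device batch size * number of GPUs * gradient-accumulation steps
per_device_batch_size = 4   # assumed; not reported directly
num_gpus = 4                # reported
grad_accum_steps = 2        # reported

effective_batch_size = per_device_batch_size * num_gpus * grad_accum_steps
print(effective_batch_size)  # → 32, matching the reported total
```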

Intended Uses & Limitations

The model card does not detail specific intended uses or limitations. Users should assume the general capabilities, and the general caveats, of a Llama 2-based chat model for language-generation tasks, and note the absence of explicit guidance on specialized applications or constraints.
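Since the base model is Llama-2-7b-chat-hf, prompts presumably follow the standard Llama 2 chat convention (`[INST]`/`<<SYS>>` markers). A minimal sketch of that formatting, noting that whether this fine-tune still expects the template is an assumption:

```python
def build_llama2_chat_prompt(user_message: str, system_prompt: str = "") -> str:
    """Format a single-turn prompt in the Llama 2 chat convention.

    This is the format the Llama-2-7b-chat-hf base model was trained with;
    whether this particular fine-tune preserves it is an assumption.
    """
    if system_prompt:
        return (
            f"<s>[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n"
            f"{user_message} [/INST]"
        )
    return f"<s>[INST] {user_message} [/INST]"

prompt = build_llama2_chat_prompt(
    "Summarize the Llama 2 license in one sentence.",
    system_prompt="You are a concise assistant.",
)
```

The resulting string would then be tokenized and passed to the model, e.g. via the `transformers` library's `generate` API.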