CharlesLi/llama_2_sky_safe_o1_llama_3_70B_default_1000_1000_full

Text Generation | Concurrency Cost: 1 | Model Size: 7B | Quant: FP8 | Context Length: 4k | Published: Jan 13, 2025 | License: llama2 | Architecture: Transformer | Open Weights

CharlesLi/llama_2_sky_safe_o1_llama_3_70B_default_1000_1000_full is a 7-billion-parameter language model fine-tuned from Meta's Llama-2-7b-chat-hf. It was fine-tuned on a generator dataset, which indicates optimization for text-generation tasks. Building on the Llama 2 chat architecture, it targets applications that require robust conversational AI or creative text output.


Model Overview

This model, llama_2_sky_safe_o1_llama_3_70B_default_1000_1000_full, is a fine-tuned variant of the Meta Llama-2-7b-chat-hf base model. It features 7 billion parameters and has a context length of 4096 tokens. The fine-tuning process focused on a "generator dataset," suggesting its primary strength lies in text generation capabilities.

Training Details

The model was trained with the following key hyperparameters:

  • Learning Rate: 2e-05
  • Batch Size: 4 (train), 4 (eval)
  • Optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
  • Epochs: 1
  • Distributed Training: Multi-GPU setup with 4 devices and 2 gradient accumulation steps, resulting in a total train batch size of 32.
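The reported total train batch size follows directly from the distributed setup above. A quick check of the arithmetic, using the values from the hyperparameter list:

```python
# Effective (total) train batch size, from the hyperparameters above.
per_device_train_batch_size = 4
num_devices = 4                  # multi-GPU setup
gradient_accumulation_steps = 2

effective_batch_size = (
    per_device_train_batch_size * num_devices * gradient_accumulation_steps
)
print(effective_batch_size)  # 32, matching the reported total train batch size
```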

Potential Use Cases

Given its fine-tuning on a generator dataset, this model is likely suitable for:

  • Text Generation: Creating coherent and contextually relevant text.
  • Conversational AI: Developing chatbots or dialogue systems based on its Llama-2-chat foundation.
  • Content Creation: Assisting with drafting articles, summaries, or creative writing pieces.