yzhuang/Llama-2-7b-chat-hf_fictional_arc_easy_english_v3

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Apr 26, 2024 · License: llama2 · Architecture: Transformer · Open Weights · Cold

The yzhuang/Llama-2-7b-chat-hf_fictional_arc_easy_english_v3 model is a 7-billion-parameter Llama-2-chat variant fine-tuned by yzhuang. It is adapted from Meta's meta-llama/Llama-2-7b-chat-hf base model and is designed for conversational applications, with a 4096-token context window for extended chat-based interactions.


Model Overview

This model, yzhuang/Llama-2-7b-chat-hf_fictional_arc_easy_english_v3, is a fine-tuned version of Meta's 7 billion parameter Llama-2-chat-hf model. Developed by yzhuang, it leverages the robust Llama-2 architecture, known for its strong performance in conversational AI.

Key Characteristics

  • Base Model: Fine-tuned from meta-llama/Llama-2-7b-chat-hf.
  • Parameter Count: 7 billion parameters, offering a balance between performance and computational efficiency.
  • Context Length: Supports a 4096-token context window, suitable for maintaining coherent, extended conversations; see the prompt-formatting sketch after this list.
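
Because the fine-tune is chat-oriented, prompts should follow the Llama-2 [INST] ... [/INST] conversation format. The sketch below assumes the model repository inherits the chat template of its meta-llama/Llama-2-7b-chat-hf base tokenizer (not confirmed by this card); it renders a message list into that format and checks it against the 4096-token window:

```python
from transformers import AutoTokenizer

# Assumption: the fine-tune ships the same chat template as its
# meta-llama/Llama-2-7b-chat-hf base tokenizer.
tokenizer = AutoTokenizer.from_pretrained(
    "yzhuang/Llama-2-7b-chat-hf_fictional_arc_easy_english_v3"
)

messages = [
    {"role": "user", "content": "Explain what a context window is in one sentence."},
]

# Render the conversation in the Llama-2 [INST] ... [/INST] chat format.
prompt = tokenizer.apply_chat_template(messages, tokenize=False)

# Verify the prompt fits inside the 4096-token context window.
n_tokens = len(tokenizer(prompt)["input_ids"])
assert n_tokens <= 4096, f"prompt uses {n_tokens} tokens, over the 4k limit"
print(prompt)
```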

Training Details

The model was trained with the following hyperparameters (see the configuration sketch after the list):

  • Learning Rate: 5e-05
  • Batch Sizes: train_batch_size of 1 and eval_batch_size of 2, with gradient_accumulation_steps of 8, giving a total_train_batch_size of 8.
  • Optimizer: Adam with default betas and epsilon.
  • Scheduler: Linear learning rate scheduler.
  • Epochs: Trained for 18 epochs.
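
Although the original training script is not published, these settings map directly onto Hugging Face TrainingArguments. The sketch below is a plausible reconstruction under that assumption; the output directory name and the AdamW optimizer variant are illustrative choices, not confirmed details:

```python
from transformers import TrainingArguments

# A minimal sketch mapping the reported hyperparameters onto
# TrainingArguments. Anything not listed on the card (output_dir,
# optimizer variant) is an assumption.
training_args = TrainingArguments(
    output_dir="llama2-7b-chat-fictional-arc-easy",  # hypothetical name
    learning_rate=5e-05,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=8,   # effective train batch size of 8
    num_train_epochs=18,
    lr_scheduler_type="linear",      # linear learning rate schedule
    optim="adamw_torch",             # Adam(W) with default betas and epsilon
)
```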

Intended Use

While the model card does not detail specific intended uses, its fine-tuning from a chat-optimized base suggests suitability for conversational AI applications, chatbots, and interactive text generation tasks, particularly those requiring simple, easy-to-read English.
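
For such chat-style use, a minimal inference sketch with the transformers pipeline is shown below; the dtype and sampling settings are illustrative assumptions, and a recent transformers release that accepts chat-format message lists is assumed:

```python
import torch
from transformers import pipeline

# Minimal chat-style inference sketch. The model ID comes from this card;
# dtype, device placement, and sampling settings are assumptions.
chat = pipeline(
    "text-generation",
    model="yzhuang/Llama-2-7b-chat-hf_fictional_arc_easy_english_v3",
    torch_dtype=torch.float16,
    device_map="auto",
)

messages = [{"role": "user", "content": "Describe photosynthesis in simple English."}]
reply = chat(messages, max_new_tokens=128, do_sample=True, temperature=0.7)

# The pipeline returns the full conversation; the last message is the reply.
print(reply[0]["generated_text"][-1]["content"])
```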