CharlesLi/llama_2_alpaca_llama_2

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Context Length: 4k · Published: Dec 31, 2024 · License: llama2 · Architecture: Transformer · Open Weights · Cold

CharlesLi/llama_2_alpaca_llama_2 is a 7 billion parameter language model fine-tuned from Meta's Llama-2-7b-chat-hf. It was trained on an unspecified dataset and reached a validation loss of 0.7577. The model is intended for general conversational AI tasks and retains Llama 2's 4096-token context length.


Model Overview

CharlesLi/llama_2_alpaca_llama_2 is a 7-billion-parameter language model derived from Meta's Llama-2-7b-chat-hf. It was fine-tuned on an unspecified dataset and reached a final validation loss of 0.7577 after only 50 training steps.

Key Training Details

  • Base Model: meta-llama/Llama-2-7b-chat-hf
  • Parameters: 7 Billion
  • Context Length: 4096 tokens
  • Learning Rate: 0.0002
  • Optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
  • Training Steps: 50
  • Frameworks: PEFT 0.12.0, Transformers 4.44.2, PyTorch 2.4.1+cu121 (see the loading sketch after this list)
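
Because training used PEFT, the published checkpoint is most likely a LoRA-style adapter rather than a full set of 7B weights. Below is a minimal loading sketch, assuming the repository hosts a PEFT adapter for the listed base model; the two repo IDs come from this page, and everything else is standard peft/transformers usage rather than documented setup for this specific model:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-2-7b-chat-hf"        # base model listed above
adapter_id = "CharlesLi/llama_2_alpaca_llama_2"  # this model's repository (assumed to hold a PEFT adapter)

# Load the Llama-2 chat base model, then attach the fine-tuned adapter on top.
tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()
```

Note that the meta-llama base repository is gated on Hugging Face, so this snippet assumes you have already accepted Meta's license terms for Llama 2.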

Intended Uses

This model is suited to general-purpose conversational AI applications built on the Llama 2 architecture. Its fine-tuning setup points to instruction-following or chat-style interaction, though the original documentation does not detail specific use cases; a prompting sketch follows.
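
Since the base model is Llama-2-7b-chat-hf, prompts wrapped in Llama 2's [INST] ... [/INST] chat template are a reasonable default. The sketch below assumes the `model` and `tokenizer` objects from the loading snippet above; the prompt text and sampling parameters are illustrative, not settings documented for this model:

```python
prompt = "[INST] Explain what fine-tuning a language model means. [/INST]"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=256,
        do_sample=True,
        temperature=0.7,
    )

# Decode only the tokens generated after the prompt.
new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```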