hamxea/Llama-2-7b-chat-hf-activity-fine-tuned-v4

Text Generation | Concurrency Cost: 1 | Model Size: 7B | Quant: FP8 | Ctx Length: 4k | Published: Jan 15, 2024 | License: other | Architecture: Transformer

hamxea/Llama-2-7b-chat-hf-activity-fine-tuned-v4 is a 7-billion-parameter auto-regressive language model based on the Transformer architecture, originally developed by Meta AI. This version is a Hugging Face conversion of the original Llama-7B, with checkpoints saved in 2 shards for faster loading. It is intended primarily for research on large language models, including applications such as question answering and natural language understanding.


Model Overview

This model, hamxea/Llama-2-7b-chat-hf-activity-fine-tuned-v4, is a 7-billion-parameter auto-regressive language model based on the Transformer architecture, originally developed by Meta AI. It is a Hugging Face conversion of the foundational Llama-7B model, updated to work with transformers>=4.28.0, with checkpoints saved in 2 shards for faster loading than earlier conversions.
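The sharded checkpoints mean the model loads like any other Hugging Face causal LM. A minimal sketch, assuming a recent transformers (>= 4.28.0, as the card requires) and a GPU; the dtype, device placement, and generation length below are illustrative assumptions, not values recommended by the model authors:

```python
# Sketch of loading and querying the model with Hugging Face transformers.
# MODEL_ID comes from the card; everything else is an assumption.

MODEL_ID = "hamxea/Llama-2-7b-chat-hf-activity-fine-tuned-v4"

def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Lazily load the model and generate a completion for `prompt`."""
    # Imports live inside the function so importing this module does not
    # pull in torch/transformers or download the ~7B checkpoint.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.float16,  # assumption: half precision to fit one GPU
        device_map="auto",
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Return only the newly generated text, not the echoed prompt.
    return tokenizer.decode(
        output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
```

Calling `generate("...")` will download the sharded checkpoint on first use; subsequent loads come from the local Hugging Face cache.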

Key Capabilities & Characteristics

  • Architecture: Transformer-based auto-regressive language model.
  • Parameter Count: 7 billion parameters.
  • Training: Trained between December 2022 and February 2023 on a diverse dataset including CCNet, C4, GitHub, Wikipedia, Books, ArXiv, and Stack Exchange, with a strong emphasis on English content.
  • Performance: Achieves competitive results on common sense reasoning benchmarks, including 76.5 on BoolQ, 79.8 on PIQA, and 76.1 on HellaSwag.
  • Bias Evaluation: Evaluated for biases across categories like gender, religion, race, and age, with an average bias score of 66.6.

Intended Use Cases

This model is primarily intended for research purposes in large language models. Specific research applications include:

  • Exploring potential applications such as question answering, natural language understanding, or reading comprehension.
  • Understanding the capabilities and limitations of current language models.
  • Developing techniques to improve language models, including evaluating and mitigating biases, risks, toxic content generation, and hallucinations.
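For chat-style use cases like the question answering mentioned above, prompts presumably follow the Llama-2 chat template, since this is a fine-tune of a Llama-2 chat checkpoint. A minimal sketch under that assumption (check the tokenizer's chat template to confirm; the system message below is a hypothetical placeholder):

```python
# Minimal sketch of the Llama-2 chat prompt format ([INST] / <<SYS>> markers).
# Assumption: the fine-tune keeps the base Llama-2-chat template.

DEFAULT_SYSTEM = "You are a helpful assistant. Answer concisely."

def build_prompt(user_message: str, system_message: str = DEFAULT_SYSTEM) -> str:
    """Wrap a single-turn user message in the Llama-2 chat format."""
    return (
        f"<s>[INST] <<SYS>>\n{system_message}\n<</SYS>>\n\n"
        f"{user_message} [/INST]"
    )

prompt = build_prompt("What is question answering?")
```

For multi-turn conversations, each prior exchange is appended as `[INST] ... [/INST] answer </s>` before the new user turn; using `tokenizer.apply_chat_template` avoids hand-building this string.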

Note: As a foundational model, it is not intended for direct use in downstream applications without further risk evaluation and mitigation, as it has not been trained with human feedback and may generate unhelpful, incorrect, or offensive content.