sharpbai/llama-7b-hf
Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · License: other · Architecture: Transformer

sharpbai/llama-7b-hf is a 7 billion parameter auto-regressive language model based on the Transformer architecture, derived from Meta AI's LLaMA-7B. This version is converted for compatibility with the current (git head) HuggingFace Transformers library and fixes known EOS token handling issues. It is intended primarily for research on large language models: exploring applications such as question answering and natural language understanding, and studying model capabilities and limitations.


Overview

sharpbai/llama-7b-hf is a 7 billion parameter LLaMA model, originally developed by Meta AI's FAIR team. This specific repository provides a version converted to work with current HuggingFace Transformers, resolving known EOS token issues. The model is an auto-regressive language model built on the transformer architecture, trained between December 2022 and February 2023.
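Since the repository's purpose is compatibility with current HuggingFace Transformers, loading should follow the standard `AutoModelForCausalLM` pattern. The sketch below is illustrative, not from this repository's documentation; it assumes the weights have been downloaded under the model's license terms and that enough memory is available (float16 puts the 7B model at roughly 14 GB):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "sharpbai/llama-7b-hf"

# Load the converted tokenizer and weights.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

prompt = "The LLaMA family of language models"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Auto-regressive generation; the repaired EOS token lets decoding
# terminate cleanly instead of always running to max_new_tokens.
output_ids = model.generate(
    **inputs, max_new_tokens=64, eos_token_id=tokenizer.eos_token_id
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

As a base (non-instruction-tuned) model, it completes text rather than follows instructions, so prompts should be phrased as passages to continue.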

Key Capabilities

  • Research Focus: Primarily designed for research into large language models, including understanding their capabilities, limitations, and potential applications such as question answering and natural language understanding.
  • Multilingual Data: While predominantly English, the training data included 20 languages (bg, ca, cs, da, de, en, es, fr, hr, hu, it, nl, pl, pt, ro, ru, sl, sr, sv, uk) from sources like Wikipedia and Books.
  • Performance Benchmarks: Evaluated on common sense reasoning tasks (BoolQ, PIQA, SIQA, HellaSwag, WinoGrande, ARC, OpenBookQA, COPA) and showed competitive performance for its size, achieving 76.5% on BoolQ and 93% on COPA.
  • Bias Evaluation: Assessed for biases across categories like gender, religion, race, and age, with an average bias score of 66.6.

Intended Use Cases

  • Academic Research: Ideal for researchers in natural language processing, machine learning, and artificial intelligence to study LLM behavior, develop new techniques, and evaluate biases.
  • Foundation Model: Serves as a base model for further fine-tuning and development, though it requires additional risk evaluation and mitigation for downstream applications due to its foundational nature and lack of human feedback training.

Limitations

  • Non-Commercial License: The model is distributed under a non-commercial bespoke license, restricting its use to research purposes.
  • Potential for Harmful Content: As a base model trained on web data, it may generate toxic, offensive, or incorrect information and is not suitable for applications requiring high accuracy or safety without further mitigation.