amdnsr/llama-7b-hf
The amdnsr/llama-7b-hf model is a 7-billion-parameter auto-regressive language model based on the Transformer architecture, developed by the FAIR team of Meta AI. This foundational model is primarily intended for research into large language models: understanding their capabilities and limitations, and developing techniques to mitigate bias and harmful content. It performs well on common sense reasoning, reading comprehension, and natural language understanding tasks, and has been evaluated on benchmarks such as MMLU and BIG-bench Hard.
Overview
amdnsr/llama-7b-hf is a 7-billion-parameter version of the LLaMA (Large Language Model Meta AI) foundational model, developed by Meta AI's FAIR team. Trained between December 2022 and February 2023, it is an auto-regressive language model built on the Transformer architecture and has been converted for compatibility with the Hugging Face Transformers library.
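As a quick illustration of that compatibility, here is a minimal loading sketch (not part of the original card). It assumes the `transformers`, `accelerate`, and `torch` packages are installed and that enough memory is available for the roughly 13 GB of fp16 weights; the dtype and device placement are choices made here, not requirements of the checkpoint:

```python
# Minimal loading sketch for the converted checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "amdnsr/llama-7b-hf"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision roughly halves memory use vs. fp32
    device_map="auto",          # requires `accelerate`; places weights on available device(s)
)
```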
Key Capabilities
- Research on Large Language Models: Primarily designed for academic and research purposes to explore applications, understand limitations, and develop improvements for LLMs.
- Common Sense Reasoning: Demonstrates strong performance on benchmarks such as BoolQ (76.5%), HellaSwag (76.1%), and COPA (93%); a zero-shot prompting sketch follows this list.
- Natural Language Understanding: Evaluated on tasks such as reading comprehension and general NLU, with reported results on MMLU and BIG-bench Hard.
- Bias Evaluation: The model's biases related to gender, religion, race, sexual orientation, age, nationality, disability, physical appearance, and socioeconomic status have been evaluated.
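As referenced above, here is a hypothetical zero-shot prompt in the style of a BoolQ item, continuing from the loading sketch. The prompt format and the example passage are illustrative assumptions, not the evaluation harness behind the reported numbers:

```python
# Zero-shot yes/no prompt in the style of a BoolQ item (illustrative only).
prompt = (
    "Passage: The Great Barrier Reef is the world's largest coral reef "
    "system, located off the coast of Queensland, Australia.\n"
    "Question: Is the Great Barrier Reef located in Australia?\n"
    "Answer:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=5, do_sample=False)

# Decode only the newly generated tokens, not the prompt.
answer = tokenizer.decode(
    output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(answer)
```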
Training and Data
The LLaMA 7B model was trained on 1 trillion tokens from a diverse dataset including CCNet (67%), C4 (15%), GitHub (4.5%), Wikipedia (4.5%), Books (4.5%), ArXiv (2.5%), and Stack Exchange (2%). The training data included content in 20 languages, though English constitutes the majority, suggesting better performance for English tasks.
Intended Use Cases
This is a base (foundational) model intended for researchers in natural language processing, machine learning, and artificial intelligence. It is suitable for:
- Exploring potential applications like question answering and reading comprehension.
- Understanding the capabilities and limitations of current language models (a configuration-inspection sketch follows this list).
- Developing techniques to improve model performance and mitigate issues like bias, toxicity, and hallucinations.
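As referenced in the list above, a simple way to start probing the model is to inspect the converted checkpoint's configuration. This sketch continues from the loading example; the expected values in the comments are the standard LLaMA-7B hyperparameters rather than anything stated on this card, so verify them against the checkpoint itself:

```python
# Inspect the architecture of the loaded checkpoint.
print(model.config.num_hidden_layers)    # expected: 32 transformer blocks
print(model.config.hidden_size)          # expected: 4096
print(model.config.num_attention_heads)  # expected: 32

# Count parameters; LLaMA-7B is roughly 6.7 billion.
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e9:.2f}B parameters")
```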
Out-of-scope uses include direct deployment in downstream applications without further risk evaluation and mitigation: the model has not been trained with human feedback and may generate toxic, offensive, or factually incorrect content.