yahma/llama-7b-hf

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Apr 8, 2023 · License: other · Architecture: Transformer

The yahma/llama-7b-hf model is a 7 billion parameter auto-regressive language model, based on the Transformer architecture, developed by the FAIR team of Meta AI. This version is a conversion of the original LLaMA-7B weights for compatibility with HuggingFace Transformers, with EOS token issues resolved. Primarily intended for research, it serves as a foundational model for exploring applications such as question answering and natural language understanding, and supports a context length of 4096 tokens.


LLaMA-7B: A Foundational Model for LLM Research

The yahma/llama-7b-hf model is a 7 billion parameter variant of the original LLaMA (Large Language Model Meta AI) series, developed by Meta AI's FAIR team. This specific release is a conversion optimized for use with HuggingFace Transformers, addressing initial EOS token compatibility issues. LLaMA models are auto-regressive language models built on the Transformer architecture, designed to facilitate research into large language models.
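Since the release targets HuggingFace Transformers compatibility, a minimal loading-and-generation sketch is shown below; the dtype, device placement, and sampling settings are illustrative assumptions, not recommended values.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the HF-converted LLaMA-7B weights and tokenizer.
# float16 and device_map="auto" (which requires the accelerate package)
# are assumed here to fit the 7B model on a single GPU; adjust for your hardware.
tokenizer = AutoTokenizer.from_pretrained("yahma/llama-7b-hf")
model = AutoModelForCausalLM.from_pretrained(
    "yahma/llama-7b-hf",
    torch_dtype=torch.float16,
    device_map="auto",
)

prompt = "The LLaMA family of language models was developed by"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Auto-regressive decoding; sampling parameters are arbitrary examples.
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because this is a base model with no instruction tuning, plain continuation prompts like the one above generally work better than chat-style instructions.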

Key Capabilities & Characteristics

  • Architecture: Transformer-based, auto-regressive language model.
  • Parameter Count: 7 billion parameters, part of a family including 13B, 33B, and 65B versions.
  • Training Data: Trained on a diverse dataset including CCNet (67%), C4 (15%), GitHub (4.5%), Wikipedia (4.5%), Books (4.5%), ArXiv (2.5%), and Stack Exchange (2%).
  • Multilingual Support: While predominantly English, the training data includes content from 20 languages, suggesting some multilingual capability.
  • Research Focus: Intended for research in natural language processing, machine learning, and AI, particularly for understanding LLM capabilities, limitations, and developing mitigation techniques for biases and harmful content.

Intended Use Cases

  • LLM Research: Ideal for exploring applications such as question answering, natural language understanding, and reading comprehension.
  • Bias and Risk Evaluation: Useful for evaluating and mitigating biases, risks, toxic content generation, and hallucinations in language models.
  • Foundational Model Development: Serves as a base model for further fine-tuning and application-specific development (see the sketch after this list), though, as a base model not trained with human feedback, it requires additional risk evaluation and mitigation before downstream deployment.
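As one common route for the fine-tuning use case above, the sketch below attaches LoRA adapters with the peft library; peft itself and all hyperparameter values are assumptions for illustration, not part of this model's documentation.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Load the base model, then wrap it with LoRA adapters so only a small
# set of low-rank matrices is trained for the downstream task.
base = AutoModelForCausalLM.from_pretrained("yahma/llama-7b-hf")
lora_cfg = LoraConfig(
    r=8,                                  # adapter rank (assumed value)
    lora_alpha=16,                        # scaling factor (assumed value)
    target_modules=["q_proj", "v_proj"],  # LLaMA attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()  # only a tiny fraction of the 7B params train
```

Adapter-style tuning keeps the original 7B weights frozen, which suits the research framing here: downstream behavior changes can be isolated, evaluated, and rolled back without retraining the base model.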