dfurman/LLaMA-13B

Text generation · Model size: 13B · Quantization: FP8 · Context length: 4k · Concurrency cost: 1 · Published: Jun 17, 2023 · License: other · Architecture: Transformer

LLaMA-13B is a 13 billion parameter causal decoder-only language model developed by the FAIR team at Meta AI, trained on a 1 trillion token corpus including CCNet, C4, GitHub, and Wikipedia. This base model is primarily intended for research in large language models, focusing on exploring applications, understanding capabilities and limitations, and evaluating biases. It supports text generation and is designed for researchers in NLP, machine learning, and AI.


LLaMA-13B: A Foundation Model for LLM Research

LLaMA-13B is a 13 billion parameter causal decoder-only language model developed by the FAIR team at Meta AI. It was trained on a 1 trillion token corpus drawn from diverse datasets: CCNet (67%), C4 (15%), GitHub (4.5%), Wikipedia (4.5%), Books (4.5%), ArXiv (2.5%), and Stack Exchange (2%). The training data spans 20 languages, but English constitutes the large majority, so performance is expected to be strongest on English-language tasks.
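The reported mixture can be expressed as a small table in code, which also makes it easy to translate the percentage shares into approximate token counts (the shares and the 1 trillion token total come from this card; the uniform-share conversion is an assumption):

```python
# Training-data mixture as reported on this card (percent of the corpus).
MIXTURE = {
    "CCNet": 67.0,
    "C4": 15.0,
    "GitHub": 4.5,
    "Wikipedia": 4.5,
    "Books": 4.5,
    "ArXiv": 2.5,
    "Stack Exchange": 2.0,
}

TOTAL_TOKENS = 1_000_000_000_000  # 1 trillion tokens, per the card

# Approximate tokens contributed by each source, assuming the percentages
# apply directly to the token total (an illustrative simplification).
tokens_by_source = {name: int(TOTAL_TOKENS * pct / 100) for name, pct in MIXTURE.items()}

# The reported shares should cover the whole corpus.
assert sum(MIXTURE.values()) == 100.0
```

Under this reading, CCNet alone contributes roughly 670 billion tokens, which explains why the model's strengths track English web text.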

Key Capabilities & Characteristics

  • Base Model: LLaMA-13B is a foundational model, meaning it is designed for further fine-tuning and research rather than direct deployment in downstream applications.
  • Research Focus: Its primary intended use is for research into large language models, including exploring applications like question answering and natural language understanding, evaluating model capabilities and limitations, and studying biases, risks, and harmful content generation.
  • Training Data: The model's extensive training on a diverse web-sourced dataset provides a broad understanding of language patterns.
  • Non-Commercial License: The model is released under a bespoke non-commercial license, restricting its use to research and non-commercial purposes.

When to Use This Model

  • LLM Researchers: Ideal for researchers in natural language processing, machine learning, and artificial intelligence who are investigating foundational language models.
  • Bias and Safety Studies: Suitable for evaluating and mitigating biases, risks, and the generation of toxic or harmful content in LLMs.
  • Exploratory Applications: Can be used to explore potential applications such as question answering or reading comprehension, with the understanding that further fine-tuning and risk assessment are required for practical deployment.

Note: As a base model, LLaMA-13B has not been fine-tuned with human feedback and may generate toxic, offensive, or factually incorrect output. It requires additional fine-tuning and risk evaluation before use in production or sensitive applications.
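For researchers experimenting with the model, a minimal text-generation sketch using the Hugging Face `transformers` library is shown below. The repo id is taken from this page; the sketch assumes you have been granted access to the license-gated weights, and the sampling parameters are illustrative, not values from the card:

```python
MODEL_ID = "dfurman/LLaMA-13B"  # repo id from this card; weight access is license-gated

def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Sample a continuation from the base model (research use only)."""
    # Heavy dependencies are imported lazily so the sketch can be read
    # without downloading the 13B checkpoint.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,    # base models are not instruction-tuned; sampling is typical
        temperature=0.7,   # illustrative value, not from the card
    )
    return tokenizer.decode(output[0], skip_special_tokens=True)

if __name__ == "__main__":
    # Base-model prompting: supply text to be continued, not an instruction.
    print(generate("Large language models are"))
```

Because this is a raw base model with no chat or instruction template, prompts should be phrased as text to continue rather than as questions or commands.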