Ejafa/llama_13B
Text generation · Model size: 13B · Quantization: FP8 · Context length: 4k · License: other · Architecture: Transformer · Concurrency cost: 1

Ejafa/llama_13B is a 13-billion-parameter auto-regressive language model developed by the FAIR team at Meta AI, based on the transformer architecture with a 4096-token context length. This version specifically resolves EOS token issues. It is intended primarily for research into large language models, including applications such as question answering and natural language understanding, and it performs strongly on common sense reasoning benchmarks.


Model Overview

Ejafa/llama_13B is a 13-billion-parameter auto-regressive language model developed by Meta AI's FAIR team, part of the LLaMA family of models. Trained between December 2022 and February 2023, this version specifically addresses EOS token issues. It is built on the transformer architecture and was trained on a diverse dataset including CCNet, C4, GitHub, Wikipedia, Books, ArXiv, and Stack Exchange, predominantly in English.

Key Capabilities

  • Research Foundation: Primarily intended for research in large language models, focusing on understanding capabilities, limitations, and developing improvements.
  • Common Sense Reasoning: Demonstrates strong performance on various common sense reasoning benchmarks such as BoolQ, PIQA, SIQA, HellaSwag, and WinoGrande.
  • Multilingual Data: While predominantly English, the training data included 20 languages, suggesting some multilingual understanding.
  • Bias Evaluation: Evaluated for biases across gender, religion, race, sexual orientation, age, nationality, disability, physical appearance, and socio-economic status.
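Benchmarks such as PIQA and HellaSwag are typically evaluated zero-shot by log-likelihood: each candidate completion is scored by the summed log-probability the model assigns to its tokens, and the highest-scoring candidate is taken as the prediction. A minimal sketch of that scoring loop follows; the function names are illustrative assumptions, and `model`/`tokenizer` stand for any `transformers`-style causal LM and its tokenizer, not an API specific to this model card.

```python
# Sketch: zero-shot multiple-choice scoring by log-likelihood, as used
# for common sense benchmarks like PIQA or HellaSwag. The helper names
# are assumptions for illustration.

def choice_logprob(model, tokenizer, context: str, choice: str) -> float:
    """Summed log-probability of the choice tokens given the context."""
    # Imported here so the pure helper below stays dependency-free.
    import torch

    ctx_ids = tokenizer(context, return_tensors="pt").input_ids
    full_ids = tokenizer(context + choice, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    # Row i of this tensor holds log-probs for predicting token i + 1.
    logprobs = torch.log_softmax(logits[0, :-1], dim=-1)
    n_ctx = ctx_ids.shape[1]
    choice_ids = full_ids[0, n_ctx:]
    # Gather the log-prob the model assigned to each actual choice token.
    token_lps = logprobs[n_ctx - 1:, :].gather(1, choice_ids.unsqueeze(1))
    return token_lps.sum().item()

def pick_choice(scores: dict[str, float]) -> str:
    """Return the candidate completion with the highest score."""
    return max(scores, key=scores.get)
```

One caveat with this approach: tokenizing `context + choice` as a single string can merge tokens across the boundary, so careful harnesses tokenize the pieces separately and concatenate the ids.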

Intended Use Cases

  • Exploring Applications: Suitable for exploring potential applications like question answering, natural language understanding, and reading comprehension.
  • Model Analysis: Ideal for researchers studying the capabilities and limitations of current language models.
  • Bias and Toxicity Research: Useful for evaluating and mitigating biases, risks, toxic content generation, and hallucinations in LLMs.
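A question-answering exploration with this checkpoint can be sketched with the Hugging Face `transformers` API. This is a minimal sketch under stated assumptions: the helper names, prompt format, and generation settings are illustrative and not part of the model card, and actually calling `generate_answer` downloads the 13B weights and requires substantial GPU memory.

```python
# Sketch: asking Ejafa/llama_13B a question via Hugging Face transformers.
# Helper names, prompt format, and generation settings are illustrative
# assumptions, not part of the model card.

def build_prompt(question: str) -> str:
    # LLaMA is a base (non-instruction-tuned) model, so a plain-text
    # continuation prompt works better than a chat-style template.
    return f"Question: {question}\nAnswer:"

def generate_answer(question: str, model_id: str = "Ejafa/llama_13B",
                    max_new_tokens: int = 32) -> str:
    # Imported here so the pure prompt helper stays dependency-free.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.float16, device_map="auto"
    )
    inputs = tokenizer(build_prompt(question), return_tensors="pt").to(model.device)
    # Greedy decoding keeps the sketch deterministic.
    output = model.generate(**inputs, max_new_tokens=max_new_tokens,
                            do_sample=False)
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

Because this is a base model, expect a raw continuation (which may run past the answer) rather than a tidy chat reply; truncating at the first newline after "Answer:" is a common post-processing step.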

Limitations

As a base model, LLaMA has not been trained with human feedback and may generate toxic, offensive, or incorrect information. It is not recommended for downstream applications without further risk evaluation and mitigation.