Ejafa/llama_7B
Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · License: other · Architecture: Transformer

Ejafa/llama_7B is a 7 billion parameter auto-regressive language model based on the transformer architecture, developed by the FAIR team of Meta AI. This version specifically resolves EOS token issues and is intended for research on large language models, including applications such as question answering and natural language understanding. It is a foundational model and performs best in English, since English dominates its training data.


Ejafa/llama_7B Model Summary

Ejafa/llama_7B is a 7 billion parameter LLaMA model, developed by Meta AI's FAIR team, designed for research on large language models. This version resolves EOS token issues present in earlier iterations. Trained between December 2022 and February 2023, it is a foundational, auto-regressive language model built on the transformer architecture.

Key Characteristics & Capabilities

  • Architecture: Transformer-based, auto-regressive language model.
  • Parameter Count: 7 billion parameters.
  • Context Length: 4096 tokens.
  • Training Data: Primarily English, with 20 languages included, composed of CCNet (67%), C4 (15%), GitHub (4.5%), Wikipedia (4.5%), Books (4.5%), ArXiv (2.5%), and Stack Exchange (2%).
  • Performance: Evaluated on common sense reasoning benchmarks like BoolQ (76.5%), PIQA (79.8%), and HellaSwag (76.1%).
  • Bias Evaluation: Assessed for biases across categories such as gender, religion, race, and age, with an average bias score of 66.6.
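The characteristics above describe an auto-regressive model: each token is predicted from the tokens before it, conditioned on at most the 4096-token context window. A minimal sketch of that greedy decoding loop, with a toy next-token distribution standing in for the real 7B transformer (all function names here are illustrative, not part of any released API):

```python
import math
import random

def softmax(logits):
    # Numerically stable softmax over a list of vocabulary logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def toy_logits(context, vocab_size=8):
    # Stand-in for the transformer forward pass: a deterministic
    # function of the current context (the real model is a 7B network).
    rng = random.Random(sum(context))
    return [rng.gauss(0.0, 1.0) for _ in range(vocab_size)]

def generate(prompt_ids, max_new_tokens=5, eos_id=0, ctx_len=4096):
    # Greedy auto-regressive decoding: at each step, feed the context
    # (truncated to the model's 4096-token window), pick the argmax
    # token, append it, and stop on the EOS token.
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        context = ids[-ctx_len:]
        probs = softmax(toy_logits(context))
        next_id = max(range(len(probs)), key=probs.__getitem__)
        ids.append(next_id)
        if next_id == eos_id:
            break
    return ids

print(generate([3, 1, 4]))
```

In real use the toy forward pass would be replaced by the model itself (e.g. via a library such as Hugging Face Transformers), and sampling strategies like temperature or top-p would typically replace the pure argmax shown here.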

Intended Use Cases

  • Research: Ideal for researchers exploring large language model applications, understanding capabilities and limitations, and developing improvement techniques.
  • Application Exploration: Suitable for investigating potential uses in question answering, natural language understanding, and reading comprehension.
  • Bias & Toxicity Studies: Useful for evaluating and mitigating biases, risks, toxic content generation, and hallucinations in LLMs.

Limitations & Considerations

As a foundational model, Ejafa/llama_7B has not been fine-tuned with human feedback and may generate toxic, offensive, or factually incorrect content. It is released under a non-commercial license, and downstream deployments require further risk evaluation and mitigation, especially in applications that affect human safety or well-being.