hemanth-kj/llama-2-7B

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · License: llama2 · Architecture: Transformer · Open Weights · Cold

hemanth-kj/llama-2-7B is a 7 billion parameter, 4096-token context length generative text model from the Llama 2 family developed by Meta. This model is a pretrained version of Llama 2, an auto-regressive language model using an optimized transformer architecture. It is intended for commercial and research use in English and can be adapted for a variety of natural language generation tasks.


Overview

hemanth-kj/llama-2-7B is a 7 billion parameter model from Meta's Llama 2 family of large language models. It is a pretrained, auto-regressive language model built on an optimized transformer architecture, designed to generate text. The Llama 2 models were trained on a new mix of publicly available online data, totaling 2 trillion tokens, with a data cutoff of September 2022.
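As a minimal sketch of how such a pretrained model is typically used, the snippet below loads the weights through the Hugging Face transformers library and runs auto-regressive generation. The repo id `hemanth-kj/llama-2-7B` is assumed to resolve on the Hub, and gated Llama 2 weights require an accepted license; sampling parameters here are illustrative, not recommendations.

```python
def generate(prompt: str,
             model_id: str = "hemanth-kj/llama-2-7B",
             max_new_tokens: int = 64) -> str:
    """Generate a continuation of `prompt` with a causal language model."""
    # transformers is imported lazily so the helper can be defined
    # without pulling in the heavy dependency until generation time.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # Auto-regressive decoding: the model emits one token at a time,
    # each conditioned on the prompt plus all previously generated tokens.
    output_ids = model.generate(**inputs,
                                max_new_tokens=max_new_tokens,
                                do_sample=True,
                                temperature=0.7)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

if __name__ == "__main__":
    print(generate("The capital of France is"))
```

Because this is a base (non-chat) model, prompts work best as text to be continued rather than as instructions.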

Key Capabilities

  • Text Generation: Capable of generating human-like text based on input prompts.
  • Foundation Model: Serves as a base model that can be adapted for various natural language generation tasks.
  • Optimized Architecture: Utilizes an optimized transformer architecture for efficient performance.

Intended Use Cases

  • Commercial and Research: Suitable for both commercial applications and academic research in English.
  • Natural Language Generation: Can be fine-tuned or adapted for a wide range of NLG tasks.
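When adapting the model, one practical constraint is the 4096-token context window: the prompt plus any newly generated tokens must fit inside it. A small, self-contained sketch of left-truncating an over-long prompt (the token ids below are placeholders; in practice they come from the Llama 2 tokenizer):

```python
# Context window of the model, per the card's stated 4k context length.
CONTEXT_LENGTH = 4096

def truncate_prompt(token_ids: list[int],
                    max_new_tokens: int,
                    context_length: int = CONTEXT_LENGTH) -> list[int]:
    """Trim a tokenized prompt so prompt + generation fits the window."""
    budget = context_length - max_new_tokens
    if budget <= 0:
        raise ValueError("max_new_tokens must be smaller than the context length")
    # Keep the most recent tokens so generation continues from the
    # freshest context; older tokens are dropped from the left.
    return token_ids[-budget:]

ids = list(range(5000))  # pretend-tokenized prompt longer than the window
trimmed = truncate_prompt(ids, max_new_tokens=256)
print(len(trimmed))  # 3840
```

Dropping from the left is a common default for continuation tasks, but for structured prompts (e.g. few-shot examples) you may instead want to drop whole examples so the remaining text stays well-formed.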

Performance Highlights

Compared to Llama 1 7B, Llama 2 7B shows improvements across several academic benchmarks, including:

  • Code: Improved from 14.1 to 16.8.
  • Commonsense Reasoning: Improved from 60.8 to 63.9.
  • Math: Significantly improved from 6.95 to 14.6.
  • MMLU: Improved from 35.1 to 45.3.

Limitations

  • Language: Primarily intended for use in English.
  • Static Model: Trained on an offline dataset, meaning its knowledge cutoff is September 2022 for pretraining data.
  • Safety: As with all LLMs, it may produce inaccurate, biased, or objectionable responses, requiring developers to perform safety testing and tuning.