Overview
hemanth-kj/llama-2-7B is a 7-billion-parameter model from Meta's Llama 2 family of large language models. It is a pretrained, auto-regressive language model built on an optimized transformer architecture and designed to generate text. The Llama 2 models were trained on a new mix of publicly available online data totaling 2 trillion tokens, with a pretraining data cutoff of September 2022.
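"Auto-regressive" means the model generates one token at a time, each conditioned on everything generated so far. The following toy sketch illustrates that loop with a made-up scoring function standing in for the transformer forward pass; the vocabulary, scoring rule, and function names are all illustrative, not part of the actual model.

```python
# Toy illustration of auto-regressive (greedy) generation. The transformer is
# replaced by a hypothetical scoring function over a six-token vocabulary.
VOCAB = ["<eos>", "the", "cat", "sat", "on", "mat"]

def toy_score(context, token_id):
    # Hypothetical stand-in for a model forward pass: favor the token whose
    # id follows the last generated one, and emit <eos> after five tokens.
    if len(context) >= 5:
        return 1.0 if token_id == 0 else 0.0
    return 1.0 if token_id == (context[-1] + 1) % len(VOCAB) else 0.0

def generate(prompt_ids, max_new_tokens=10):
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        # Greedy decoding: append the highest-scoring token at each step.
        next_id = max(range(len(VOCAB)), key=lambda t: toy_score(ids, t))
        ids.append(next_id)
        if next_id == 0:  # <eos> ends generation
            break
    return [VOCAB[i] for i in ids]

print(generate([1]))  # → ['the', 'cat', 'sat', 'on', 'mat', '<eos>']
```

In the real model the scoring step is a full transformer forward pass over the context, and sampling strategies (temperature, top-p) usually replace the pure argmax shown here.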
Key Capabilities
- Text Generation: Capable of generating human-like text based on input prompts.
- Foundation Model: Serves as a base model that can be adapted for various natural language generation tasks.
- Optimized Architecture: Utilizes an optimized transformer architecture for efficient performance.
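As a minimal usage sketch, the model could be loaded for text generation with the Hugging Face transformers library. The repository id `hemanth-kj/llama-2-7B` being available on the Hub is an assumption here, and the weights download (roughly 13 GB) only happens when `generate_text` is actually called.

```python
MODEL_ID = "hemanth-kj/llama-2-7B"  # assumed Hub repository id

def generate_text(prompt, max_new_tokens=128):
    # Imports kept inside the function so the sketch can be read without
    # torch/transformers installed; calling this downloads the full weights.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.float16,  # half precision to fit a single GPU
        device_map="auto",          # requires the accelerate package
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Since this is a base (non-chat) model, prompts work best as plain text continuations rather than instruction-style messages.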
Intended Use Cases
- Commercial and Research: Suitable for both commercial applications and academic research in English.
- Natural Language Generation: Can be fine-tuned or adapted for a wide range of NLG tasks.
Performance Highlights
Compared to Llama 1 7B, Llama 2 7B shows improvements across several academic benchmarks, including:
- Code: Improved from 14.1 to 16.8.
- Commonsense Reasoning: Improved from 60.8 to 63.9.
- Math: Significantly improved from 6.95 to 14.6.
- MMLU: Improved from 35.1 to 45.3.
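The absolute gains above can be put in relative terms with a few lines of arithmetic, using the scores exactly as listed:

```python
# Benchmark scores from the list above: (Llama 1 7B, Llama 2 7B).
scores = {
    "Code": (14.1, 16.8),
    "Commonsense Reasoning": (60.8, 63.9),
    "Math": (6.95, 14.6),
    "MMLU": (35.1, 45.3),
}

for name, (old, new) in scores.items():
    rel = (new - old) / old * 100  # relative improvement in percent
    print(f"{name}: +{new - old:.2f} absolute, +{rel:.1f}% relative")
```

The Math jump stands out: roughly a doubling in relative terms, versus single- to double-digit relative gains on the other groups.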
Limitations
- Language: Primarily intended for use in English.
- Static Model: Trained on an offline dataset, so its knowledge does not extend past the September 2022 pretraining data cutoff.
- Safety: As with all large language models, it may produce inaccurate, biased, or otherwise objectionable output, so developers should perform application-specific safety testing and tuning before deployment.