sundar-pichai/llama-2-7b
Llama 2 7B is a 7-billion-parameter fine-tuned generative text model developed by Meta, optimized for dialogue use cases. It uses an optimized transformer architecture with a context length of 4096 tokens. The model is designed for assistant-like chat applications and outperforms many open-source chat models on benchmarks for helpfulness and safety.
Overview
Meta's Llama 2 is a family of large language models; this variant is the 7-billion-parameter fine-tuned model. It is optimized for dialogue, making it suitable for assistant-like chat applications. The model is built on an optimized transformer architecture and has been aligned with human preferences for helpfulness and safety through supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF).
Key Capabilities
- Dialogue Optimization: Specifically fine-tuned for chat and conversational AI, outperforming many open-source chat models in human evaluations for helpfulness and safety.
- Robust Architecture: Employs an optimized transformer architecture, trained on 2 trillion tokens of publicly available data.
- Safety Alignment: Incorporates RLHF to enhance safety, achieving 0.00% toxic generations on the ToxiGen benchmark for the 7B chat model.
- Commercial Use: Available for both commercial and research use, governed by a custom Meta license.
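Because the model is fine-tuned for dialogue, prompts are typically wrapped in the chat template from Meta's reference implementation. A minimal single-turn sketch of that formatting (the `[INST]`/`<<SYS>>` delimiters follow Meta's published Llama 2 chat convention; many inference stacks apply an equivalent template for you, so check your runtime's documentation):

```python
def build_llama2_prompt(system_msg: str, user_msg: str) -> str:
    """Format a single-turn Llama 2 chat prompt.

    Uses the [INST]/<<SYS>> delimiters from Meta's reference
    implementation. The system message sets assistant behavior;
    the user message follows, closed by [/INST], after which the
    model generates its reply.
    """
    return (
        f"<s>[INST] <<SYS>>\n{system_msg}\n<</SYS>>\n\n"
        f"{user_msg} [/INST]"
    )

prompt = build_llama2_prompt(
    "You are a helpful, respectful and honest assistant.",
    "Explain what a transformer architecture is in one sentence.",
)
print(prompt)
```

The resulting string is what gets tokenized and sent to the model; multi-turn conversations repeat the `[INST] ... [/INST]` pattern with prior model replies in between.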
Good for
- Assistant-like Chatbots: Its fine-tuning makes it highly effective for building conversational agents.
- English-language Applications: Intended primarily for commercial and research use in English.
- Benchmarking: Offers strong performance on academic benchmarks, including improvements over Llama 1 across various categories like Code, Commonsense Reasoning, and MMLU.