sundar-pichai/llama-2-7b

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Architecture: Transformer

Llama 2 7B is a fine-tuned, 7-billion-parameter generative text model developed by Meta, optimized for dialogue use cases. It uses an optimized transformer architecture with a context length of 4096 tokens. The model is designed for assistant-like chat applications and outperforms many open-source chat models on benchmarks for helpfulness and safety.


Overview

Meta's Llama 2 is a family of large language models; this variant is the 7-billion-parameter model fine-tuned for dialogue use cases, making it suitable for assistant-like chat applications. The model is built on an optimized transformer architecture and was aligned with human preferences for helpfulness and safety through supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF).
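Because this variant is fine-tuned for dialogue, prompts should follow the Llama 2 chat format, which wraps user turns in `[INST]` / `[/INST]` markers and an optional `<<SYS>>` system block. A minimal sketch of a prompt builder (the function name and example messages are illustrative, not part of any official API):

```python
# Special tokens used by the Llama 2 chat fine-tuning format.
B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"

def build_llama2_prompt(turns, system_prompt=None):
    """Flatten (user, assistant) turns into the Llama 2 chat prompt string.

    `turns` is a list of (user, assistant) pairs; the assistant slot of the
    final turn is None, and the model generates after the last [/INST].
    """
    prompt = ""
    for i, (user, assistant) in enumerate(turns):
        content = user
        if i == 0 and system_prompt:
            # The system prompt is folded into the first user turn.
            content = f"{B_SYS}{system_prompt}{E_SYS}{user}"
        prompt += f"<s>{B_INST} {content} {E_INST}"
        if assistant is not None:
            prompt += f" {assistant} </s>"
    return prompt
```

For example, `build_llama2_prompt([("Hello!", None)], system_prompt="Be concise.")` produces a single-turn prompt with the system block embedded in the first `[INST]` span; earlier completed turns are closed with `</s>` before the next `<s>[INST]` begins.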

Key Capabilities

  • Dialogue Optimization: Specifically fine-tuned for chat and conversational AI, outperforming many open-source chat models in human evaluations for helpfulness and safety.
  • Robust Architecture: Employs an optimized transformer architecture, trained on 2 trillion tokens of publicly available data.
  • Safety Alignment: Incorporates RLHF to enhance safety, achieving 0.00% toxic generations on the ToxiGen benchmark for the 7B chat model.
  • Commercial Use: Available for both commercial and research use, governed by a custom Meta license.
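Since the context window is 4096 tokens, long chat histories must be trimmed before prompting. A hypothetical sketch of one common approach, dropping the oldest turns first; `count_tokens` here is a crude word-count stand-in for a real tokenizer, and all names are illustrative:

```python
def trim_to_context(turns, max_tokens=4096, reserve_for_output=512,
                    count_tokens=lambda text: len(text.split())):
    """Keep the most recent turns whose combined token estimate fits the
    budget left after reserving room for the model's reply."""
    budget = max_tokens - reserve_for_output
    kept, total = [], 0
    for turn in reversed(turns):  # walk newest -> oldest
        cost = count_tokens(turn)
        if total + cost > budget:
            break  # oldest turns beyond the budget are dropped
        kept.append(turn)
        total += cost
    return list(reversed(kept))  # restore chronological order
```

In practice the token counts should come from the model's own tokenizer, since word counts only approximate token usage; reserving headroom for the reply avoids the generation being cut off at the context limit.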

Good for

  • Assistant-like Chatbots: Its fine-tuning makes it highly effective for building conversational agents.
  • English-language Applications: Intended primarily for commercial and research use in English.
  • Benchmarking: Offers strong performance on academic benchmarks, improving on Llama 1 across categories such as Code, Commonsense Reasoning, and MMLU.