4bit/Llama-2-13b-chat-hf

Text generation · Concurrency cost: 1 · Model size: 13B · Quantization: FP8 · Context length: 4k · Architecture: Transformer

Llama-2-13b-chat-hf is a 13 billion parameter, fine-tuned generative text model developed by Meta, optimized for dialogue use cases. It uses an optimized transformer architecture and is aligned to human preferences for helpfulness and safety through supervised fine-tuning and reinforcement learning from human feedback. It is intended for commercial and research use in English, excelling in assistant-like chat applications, and supports a 4096-token context length.


Llama-2-13b-chat-hf Overview

This model is a 13 billion parameter variant from Meta's Llama 2 family, specifically fine-tuned for dialogue. It is built on an optimized transformer architecture and has undergone extensive supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) to improve its helpfulness and safety in chat applications. The model was pretrained on a new mix of publicly available online data totaling 2 trillion tokens, with a data cutoff of September 2022; some tuning data extends to July 2023.
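Because the model is fine-tuned for dialogue, prompts should follow the Llama 2 chat format that Meta documents ([INST] / <<SYS>> markers). Below is a minimal sketch of single-turn prompt construction; in practice the Hugging Face tokenizer's chat template handles this for you, and the leading <s> token is added during tokenization:

```python
def build_llama2_chat_prompt(system: str, user: str) -> str:
    """Wrap a system and a user message in the Llama 2 chat format.

    Follows the [INST]/<<SYS>> markup Meta documents for the chat
    variants; the tokenizer prepends the <s> BOS token itself.
    """
    return (
        "[INST] <<SYS>>\n"
        f"{system}\n"
        "<</SYS>>\n\n"
        f"{user} [/INST]"
    )

prompt = build_llama2_chat_prompt(
    "You are a helpful, respectful and honest assistant.",
    "What is the capital of France?",
)
print(prompt)
```

Multi-turn conversations repeat the `[INST] … [/INST]` pair for each user turn, with the model's prior replies placed between the pairs.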

Key Capabilities

  • Dialogue Optimization: Specifically fine-tuned for assistant-like chat interactions.
  • Performance: Outperforms many open-source chat models and is competitive with some closed-source models in human evaluations for helpfulness and safety.
  • Safety Alignment: Incorporates RLHF to align with human preferences, demonstrating strong safety performance on benchmarks like TruthfulQA and ToxiGen.
  • Commercial Use: Available for both commercial and research applications under a custom license from Meta.

Good For

  • Assistant-like Chatbots: Ideal for building conversational AI agents.
  • English Language Applications: Intended for use in English-speaking contexts.
  • Research and Development: Suitable for exploring and developing new natural language generation tasks, especially when adapted from the pretrained versions.