Trelis/Llama-2-7b-chat-hf-sharded-bf16

Text Generation · Concurrency cost: 1 · Model size: 7B · Quantization: FP8 · Context length: 4k · Published: Jul 21, 2023 · Architecture: Transformer

Trelis/Llama-2-7b-chat-hf-sharded-bf16 is a sharded, 7 billion parameter version of Meta's Llama 2 Chat model, optimized for dialogue use cases. This auto-regressive language model uses an optimized transformer architecture and was aligned with human preferences for helpfulness and safety through supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF). It is intended for commercial and research use in English, excelling in assistant-like chat applications.


Llama 2 Chat 7B (Sharded)

This model is a sharded version of Meta's Llama 2 Chat 7B, specifically adapted for the Hugging Face Transformers format. Llama 2 is a family of large language models developed by Meta, with this particular variant being a 7 billion parameter model fine-tuned for dialogue.

Key Capabilities

  • Dialogue Optimization: Specifically fine-tuned for chat and assistant-like interactions using supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF).
  • Performance: Outperforms many open-source chat models on standard benchmarks and is competitive with some closed-source models in human evaluations of helpfulness and safety.
  • Transformer Architecture: Utilizes an optimized auto-regressive transformer architecture.
  • Commercial and Research Use: Intended for both commercial and research applications in English.
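Because this is a Llama 2 *Chat* variant, prompts should follow the Llama 2 chat template (`[INST]`/`<<SYS>>` markers) that the model was fine-tuned on. A minimal sketch of a single-turn prompt builder is below; the helper name and example messages are illustrative, not part of the model repository:

```python
def build_llama2_prompt(system_message: str, user_message: str) -> str:
    """Wrap a system and user message in the Llama 2 chat template
    (single-turn form): <s>[INST] <<SYS>> ... <</SYS>> ... [/INST]"""
    return (
        "<s>[INST] <<SYS>>\n"
        f"{system_message}\n"
        "<</SYS>>\n\n"
        f"{user_message} [/INST]"
    )

prompt = build_llama2_prompt(
    "You are a helpful, respectful and honest assistant.",
    "What is the capital of France?",
)
print(prompt)
```

Newer versions of Transformers can also apply this template automatically via the tokenizer's `apply_chat_template` method, which avoids hand-rolling the markers.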

Good for

  • Building English-language chatbots and virtual assistants.
  • Research into dialogue systems and human-aligned AI.
  • Applications requiring a robust, fine-tuned language model for conversational tasks.
  • Memory-constrained environments: the sharded checkpoint splits the weights into smaller files that are loaded one at a time, which makes the model easier to load in settings with limited RAM, such as Google Colab GPU runtimes.
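Loading the sharded checkpoint looks the same as loading any Transformers model; the library detects the shard index and stitches the files together automatically. A minimal sketch is below (assumes `transformers`, `torch`, and `accelerate` are installed; `device_map="auto"` requires `accelerate`):

```python
def load_sharded_llama2(model_id: str = "Trelis/Llama-2-7b-chat-hf-sharded-bf16"):
    """Load the sharded Llama 2 Chat checkpoint.

    Transformers reads the shard index and loads one weight file at a
    time, keeping peak RAM usage low. Imports are kept inside the
    function so this sketch can be inspected without the heavy
    dependencies installed.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # weights are stored in bf16
        device_map="auto",           # spread layers across GPU/CPU as memory allows
    )
    return tokenizer, model
```

The function is a sketch, not a definitive recipe: on very small GPUs you may additionally want 8-bit or 4-bit loading via `bitsandbytes` (`load_in_8bit=True` / `load_in_4bit=True`).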