Jeppo/Llama-2-13B-chat

TEXT GENERATION · Concurrency Cost: 1 · Model Size: 13B · Quant: FP8 · Context Length: 4k · Architecture: Transformer

Jeppo/Llama-2-13B-chat is a 13-billion-parameter fine-tuned generative text model from Meta's Llama 2 family. Optimized for dialogue use cases, it uses an auto-regressive transformer architecture aligned with supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF). The model is intended for commercial and research use in English, excels in assistant-like chat applications, and supports a 4k-token context length.


Llama 2 13B Chat Model Overview

This model, Jeppo/Llama-2-13B-chat, is a 13 billion parameter variant from Meta's Llama 2 family of large language models. It is specifically fine-tuned for dialogue use cases and converted for the Hugging Face Transformers format. Llama-2-Chat models are noted for outperforming many open-source chat models on various benchmarks and achieving parity with some popular closed-source models in human evaluations for helpfulness and safety.

Key Capabilities

  • Dialogue Optimization: Specifically fine-tuned for assistant-like chat applications.
  • Robust Architecture: Utilizes an optimized transformer architecture with supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) for alignment with human preferences.
  • Context Length: Supports a 4k token context length, suitable for extended conversations.
  • Safety Enhancements: Demonstrates strong performance in safety evaluations, with the 13B chat model achieving 0.00% toxic generations on the ToxiGen benchmark.
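Llama-2-Chat models were trained with a specific prompt template that wraps the user turn in `[INST]` markers and an optional `<<SYS>>` system prompt. A minimal sketch of building a single-turn prompt (the helper name is illustrative, not part of this model's API):

```python
def build_llama2_prompt(user_message: str,
                        system_prompt: str = "You are a helpful assistant.") -> str:
    """Format a single-turn prompt using the Llama 2 chat template."""
    return (
        f"<s>[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n"
        f"{user_message} [/INST]"
    )

prompt = build_llama2_prompt("What is the capital of France?")
```

Deviating from this template tends to degrade the chat model's instruction-following, so it is worth formatting prompts this way even for quick experiments.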

Good for

  • Commercial and Research Applications: Intended for use in both commercial products and academic research.
  • English-language Chatbots: Ideal for building conversational AI agents and virtual assistants in English.
  • Natural Language Generation: Pretrained versions can be adapted for a variety of text generation tasks, while the chat version excels in interactive dialogue.
  • Benchmarking: Offers competitive performance across academic benchmarks, including Code (24.5), Commonsense Reasoning (66.9), and MMLU (54.8) for the 13B variant.