Llama 2 13B Chat Model Overview
This model, Jeppo/Llama-2-13B-chat, is a 13 billion parameter variant from Meta's Llama 2 family of large language models. It is fine-tuned specifically for dialogue use cases and converted to the Hugging Face Transformers format. Llama-2-Chat models are noted for outperforming many open-source chat models on common benchmarks and for approaching parity with some popular closed-source models in human evaluations of helpfulness and safety.
Key Capabilities
- Dialogue Optimization: Specifically fine-tuned for assistant-like chat applications.
- Robust Architecture: Utilizes an optimized transformer architecture with supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) for alignment with human preferences.
- Context Length: Supports a 4,096-token (4k) context window, suitable for extended conversations.
- Safety Enhancements: Demonstrates strong performance in safety evaluations, with the 13B chat model reporting a 0.00% toxic-generation rate on the ToxiGen benchmark.
Good for
- Commercial and Research Applications: Intended for use in both commercial products and academic research.
- English-language Chatbots: Ideal for building conversational AI agents and virtual assistants in English.
- Natural Language Generation: Pretrained versions can be adapted for a variety of text generation tasks, while the chat version excels in interactive dialogue.
- Benchmarking: Offers competitive performance across academic benchmarks, with the 13B base model scoring 24.5 on Code, 66.9 on Commonsense Reasoning, and 54.8 on MMLU.
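When building chatbots as described above, single-turn requests are typically formatted with the `[INST]`/`<<SYS>>` template introduced with the Llama 2 release. A minimal sketch, assuming a hypothetical helper `build_llama2_prompt` (not part of this repository), might look like:

```python
def build_llama2_prompt(system_prompt: str, user_message: str) -> str:
    """Assemble a single-turn Llama-2-Chat prompt.

    Uses the [INST] / <<SYS>> template from the Llama 2 release:
    the system prompt is wrapped in <<SYS>> tags inside the first
    [INST] block, followed by the user message.
    """
    return (
        f"<s>[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n"
        f"{user_message} [/INST]"
    )

# Example: a simple assistant-style request.
prompt = build_llama2_prompt(
    "You are a helpful assistant.",
    "Summarize Llama 2 in one sentence.",
)
print(prompt)
```

The resulting string can then be passed to the model (for instance via a Transformers `text-generation` pipeline); the model's reply is everything generated after the closing `[/INST]`.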