4bit/Llama-2-70b-chat-hf
Llama-2-70b-chat-hf is a 69 billion parameter generative text model developed by Meta, fine-tuned for dialogue use cases. This model utilizes an optimized transformer architecture and is specifically designed for assistant-like chat applications. It leverages supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety, outperforming many open-source chat models on various benchmarks.
Loading preview...
Llama-2-70b-chat-hf Overview
This model is the 70 billion parameter variant from Meta's Llama 2 family, specifically fine-tuned for dialogue use cases. It is an auto-regressive language model built on an optimized transformer architecture. The 'chat' versions, including this one, are enhanced using Supervised Fine-Tuning (SFT) and Reinforcement Learning with Human Feedback (RLHF) to improve alignment with human preferences for helpfulness and safety.
Key Capabilities & Features
- Dialogue Optimization: Specifically trained for assistant-like chat interactions.
- Performance: Outperforms other open-source chat models on tested benchmarks and is competitive with some closed-source models in human evaluations for helpfulness and safety.
- Architecture: Employs Grouped-Query Attention (GQA) for improved inference scalability, a feature present in larger Llama 2 models.
- Training Data: Pretrained on 2 trillion tokens from publicly available sources, with fine-tuning data including over one million human-annotated examples.
- Context Length: Supports a context length of 4k tokens.
Intended Use Cases
- Commercial and Research: Designed for use in both commercial products and research applications.
- Assistant-like Chat: Optimized for conversational AI and chatbot development.
Limitations
- English Only: Intended for use primarily in English.
- Safety Considerations: As with all LLMs, requires developer-side safety testing and tuning for specific applications due to potential for inaccurate, biased, or objectionable responses.