4bit/Llama-2-7b-chat-hf
Text Generation
Concurrency Cost: 1
Model Size: 7B
Quant: FP8
Ctx Length: 4k
Published: Jul 19, 2023
Architecture: Transformer

The 4bit/Llama-2-7b-chat-hf model is a 7 billion parameter, fine-tuned generative text model developed by Meta, optimized for dialogue use cases. It utilizes an optimized transformer architecture and has a context length of 4096 tokens. This model is specifically designed for assistant-like chat applications and outperforms many open-source chat models on benchmarks for helpfulness and safety. It was trained on 2 trillion tokens of publicly available data, with fine-tuning data including over one million human-annotated examples.
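Because the model is fine-tuned for assistant-style dialogue, inputs should follow the Llama 2 chat prompt template: the user turn wrapped in `[INST] ... [/INST]` tags, with an optional `<<SYS>> ... <</SYS>>` block for the system prompt. A minimal sketch of assembling a single-turn prompt (the function name and example strings are illustrative, not part of any official API):

```python
def build_llama2_chat_prompt(system_prompt: str, user_message: str) -> str:
    """Assemble a single-turn prompt in the Llama 2 chat template.

    Hypothetical helper: the [INST]/<<SYS>> tags follow the documented
    Llama 2 chat convention; string layout here is an illustrative sketch.
    """
    return (
        f"<s>[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n"
        f"{user_message} [/INST]"
    )

prompt = build_llama2_chat_prompt(
    "You are a helpful assistant.",
    "What is the capital of France?",
)
print(prompt)
```

Keeping the full prompt (system block plus conversation history) under the 4096-token context length is the caller's responsibility; longer histories must be truncated before being sent to the model.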
