Xilabs/Llama-2-7b-Sharded
TEXT GENERATION | Concurrency Cost: 1 | Model Size: 7B | Quant: FP8 | Ctx Length: 4k | License: other | Architecture: Transformer | Cold

Xilabs/Llama-2-7b-Sharded is a sharded release of a 7-billion-parameter auto-regressive language model developed by Meta, based on the Llama 2 architecture. The weights are split into smaller files (at most 650 MB each) to make loading easier across a range of devices and cloud instances. The model is designed for general natural language generation and can be adapted to diverse applications; fine-tuned chat variants are available for dialogue use cases.
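To illustrate what a per-file size cap means in practice, here is a minimal Python sketch of one way weights could be grouped into shards of at most 650 MB. It is not the tool used to produce this repository: the function `plan_shards`, the greedy grouping strategy, the decimal interpretation of "MB", and the tensor names and sizes are all assumptions made for illustration.

```python
from typing import Dict, List

# 650 MB cap from the description; decimal megabytes are an assumption.
MAX_SHARD_BYTES = 650 * 1000 * 1000


def plan_shards(tensor_sizes: Dict[str, int],
                max_bytes: int = MAX_SHARD_BYTES) -> List[List[str]]:
    """Greedily group tensors, in order, into shards of at most max_bytes.

    A single tensor larger than the cap is placed in a shard of its own.
    """
    shards: List[List[str]] = []
    current: List[str] = []
    current_bytes = 0
    for name, size in tensor_sizes.items():
        # Start a new shard when adding this tensor would exceed the cap.
        if current and current_bytes + size > max_bytes:
            shards.append(current)
            current, current_bytes = [], 0
        current.append(name)
        current_bytes += size
    if current:
        shards.append(current)
    return shards


# Hypothetical tensors: names and byte sizes are made up for illustration.
example_sizes = {
    "layer0.weight": 300_000_000,
    "layer1.weight": 300_000_000,
    "layer2.weight": 300_000_000,
    "layer3.weight": 100_000_000,
}
example_plan = plan_shards(example_sizes)
```

With these made-up sizes, the first two tensors share one shard and the last two share another, so no file exceeds the cap. Smaller files mean a loader can read one shard at a time rather than holding a single multi-gigabyte file in memory.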
