delitante-coder/llama2-7b-chat-hf-sharded-2GB
TEXT GENERATIONConcurrency Cost:1Model Size:7BQuant:FP8Ctx Length:4kArchitecture:Transformer Cold

The delitante-coder/llama2-7b-chat-hf-sharded-2GB model is a sharded version of Meta's Llama 2 7B Chat architecture, specifically designed to have a maximum file size of 2GB. This model is a conversational language model, optimized for chat-based applications and efficient deployment in environments with file size constraints. Its primary utility lies in providing a readily deployable Llama 2 7B Chat variant for interactive text generation.

Loading preview...