minkhantycc/Llama-2-7b-chat-finetune-quantized
Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quantization: FP8 · Context Length: 4K · Published: Aug 25, 2024 · License: MIT · Architecture: Transformer · Open Weights · Cold

The minkhantycc/Llama-2-7b-chat-finetune-quantized model is a 7-billion-parameter language model based on Meta's Llama-2 architecture, specifically a fine-tuned and quantized version of Llama-2-7b-chat-hf. It is designed for text generation and, thanks to its chat-optimized base, is well suited to conversational applications. Quantization trades a small amount of accuracy for reduced memory and compute requirements, making the model practical to deploy in resource-constrained environments.
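Because the model inherits Llama-2-chat's conventions, prompts generally follow the `[INST] ... [/INST]` chat format with an optional `<<SYS>>` system block. A minimal sketch of building such a prompt (the helper name `build_llama2_prompt` is illustrative, not part of this model's API; the exact template your serving stack expects may differ):

```python
def build_llama2_prompt(user_message: str, system_message: str = "") -> str:
    """Format a single-turn prompt in the Llama-2-chat style.

    The chat variants of Llama-2 were trained with user turns wrapped in
    [INST] ... [/INST] and an optional system message inside <<SYS>> tags.
    """
    if system_message:
        # System block sits at the start of the first user turn.
        inner = f"<<SYS>>\n{system_message}\n<</SYS>>\n\n{user_message}"
    else:
        inner = user_message
    return f"<s>[INST] {inner} [/INST]"


prompt = build_llama2_prompt(
    "Summarize the benefits of quantization.",
    system_message="You are a concise assistant.",
)
```

The resulting string would then be passed to the tokenizer and model (e.g. via `transformers.AutoTokenizer` / `AutoModelForCausalLM` with the repo id `minkhantycc/Llama-2-7b-chat-finetune-quantized`), with generation bounded by the 4K context length noted above.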
