nikinetrahutama/afx-ai-llama-chat-model-18
TEXT GENERATION
Concurrency Cost: 1 | Model Size: 7B | Quant: FP8 | Ctx Length: 4k | Architecture: Transformer | Cold

nikinetrahutama/afx-ai-llama-chat-model-18 is a 7-billion-parameter Llama-based language model intended for chat applications. It was trained using 4-bit quantization via the bitsandbytes library, specifically nf4 quantization with a bfloat16 compute dtype, making it suitable for memory-efficient deployment.
