nikinetrahutama/afx-issue-llama-chat-model
Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Architecture: Transformer · Cold

The nikinetrahutama/afx-issue-llama-chat-model is a 7-billion-parameter Llama-based chat model developed by nikinetrahutama. It was fine-tuned with PEFT for efficiency, using 4-bit quantization (nf4) with double quantization and a bfloat16 compute dtype. The model is optimized for chat-based applications, providing a compact yet capable option for conversational AI tasks.
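The quantization setup described above (4-bit nf4, double quantization, bfloat16 compute dtype) maps directly onto Hugging Face's `BitsAndBytesConfig`. The sketch below shows how a model with this configuration might be loaded with `transformers`; the loading code is an assumption for illustration, not taken from the model card itself.

```python
# Quantization settings as stated on the model card:
# 4-bit nf4 quantization, double quantization, bfloat16 compute dtype.
QUANT_SETTINGS = {
    "load_in_4bit": True,
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_use_double_quant": True,
}

if __name__ == "__main__":
    # Heavy imports and the model download are deferred so the settings
    # above can be inspected without a GPU or network access.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    bnb_config = BitsAndBytesConfig(
        bnb_4bit_compute_dtype=torch.bfloat16,  # compute dtype from the card
        **QUANT_SETTINGS,
    )
    model_id = "nikinetrahutama/afx-issue-llama-chat-model"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=bnb_config,
        device_map="auto",  # spread layers across available devices
    )
```

Note that nf4 quantization here applies to how the weights are stored and loaded; inference compute still runs in bfloat16 per the configured compute dtype.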
