nikinetrahutama/afx-ai-llama-chat-model-7

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Architecture: Transformer

The nikinetrahutama/afx-ai-llama-chat-model-7 is a 7 billion parameter Llama-based chat model developed by nikinetrahutama. It was trained with 4-bit quantization via the bitsandbytes library, using nf4 quantization with double quantization enabled for efficient deployment. The model is designed for conversational AI applications, leveraging its Llama architecture for general-purpose chat interactions.

Model Overview

The nikinetrahutama/afx-ai-llama-chat-model-7 is a 7 billion parameter Llama-based chat model. Developed by nikinetrahutama, this model is specifically designed for conversational AI tasks, offering a balance between performance and computational efficiency.

Training Details

The model was trained using quantization techniques that reduce its memory footprint and improve inference speed. Key aspects of its training procedure include:

  • Quantization Method: bitsandbytes was employed for quantization.
  • Quantization Type: It utilizes load_in_4bit: True with bnb_4bit_quant_type: nf4.
  • Double Quantization: bnb_4bit_use_double_quant: True was enabled, further reducing memory usage.
  • Compute Data Type: Training leveraged bfloat16 for computations.
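
The settings above can be sketched as a loading configuration. This is a minimal illustration, assuming the standard Hugging Face transformers / bitsandbytes path; the card lists only the key–value pairs, not the exact loading code.

```python
# Quantization settings exactly as listed on the card.
quant_config = {
    "load_in_4bit": True,
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_use_double_quant": True,
    "bnb_4bit_compute_dtype": "bfloat16",
}

# With transformers and bitsandbytes installed, the same settings would
# typically be passed as a BitsAndBytesConfig when loading the model:
#
#   import torch
#   from transformers import AutoModelForCausalLM, BitsAndBytesConfig
#
#   bnb = BitsAndBytesConfig(
#       load_in_4bit=True,
#       bnb_4bit_quant_type="nf4",
#       bnb_4bit_use_double_quant=True,
#       bnb_4bit_compute_dtype=torch.bfloat16,
#   )
#   model = AutoModelForCausalLM.from_pretrained(
#       "nikinetrahutama/afx-ai-llama-chat-model-7",
#       quantization_config=bnb,
#   )
```

Double quantization (quantizing the quantization constants themselves) is what makes the nf4 setup fit a 7B model into a few gigabytes of memory.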

These configurations indicate a focus on making the model efficient for deployment while maintaining its conversational capabilities. The training process also incorporated PEFT (Parameter-Efficient Fine-Tuning) version 0.5.0.dev0.
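
The card names PEFT 0.5.0.dev0 but not the specific fine-tuning method. LoRA is the most common PEFT technique for Llama models and is assumed here as an illustration; every hyperparameter below is hypothetical, not taken from the card.

```python
# Hypothetical LoRA settings (illustrative only; the card does not
# disclose the PEFT method or its hyperparameters).
lora_settings = {
    "r": 16,                               # adapter rank
    "lora_alpha": 32,                      # scaling factor
    "lora_dropout": 0.05,
    "target_modules": ["q_proj", "v_proj"],  # attention projections
    "task_type": "CAUSAL_LM",
}

# With peft installed, these would map onto its LoRA API:
#
#   from peft import LoraConfig, get_peft_model
#   config = LoraConfig(**lora_settings)
#   model = get_peft_model(base_model, config)
```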

Intended Use

This model is suitable for various chat-based applications where a Llama-architecture foundation is desired, particularly in scenarios where efficient resource utilization through 4-bit quantization is beneficial.
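
For chat use, a prompt typically needs to follow the base model's chat template. A Llama-2-style template is assumed below (the card does not specify the format this model expects), and the generation call is shown only as a commented sketch.

```python
def format_llama_chat(user_message: str,
                      system_prompt: str = "You are a helpful assistant.") -> str:
    # Llama-2-style chat template (assumption: this model follows it).
    return (
        f"<s>[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n"
        f"{user_message} [/INST]"
    )

prompt = format_llama_chat("Summarize the benefits of 4-bit quantization.")

# With transformers installed, generation would look roughly like:
#
#   from transformers import pipeline
#   chat = pipeline(
#       "text-generation",
#       model="nikinetrahutama/afx-ai-llama-chat-model-7",
#   )
#   print(chat(prompt, max_new_tokens=128)[0]["generated_text"])
```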